HUCHORDIN AND USES THEREOF

Summary of the Invention 

The invention relates to the discovery and
characterization of a new human gene, huchordin, and
huchordin polypeptides. Northern blot analysis of
huchordin mRNA reveals that the huchordin gene is
expressed as an approximately 7.5 kb transcript in adult
and fetal liver and as an approximately 4.4 kb transcript
in adult brain, heart, and pancreas. An additional
approximately 2.7 kb transcript is observed in fetal
liver.

A cDNA corresponding to huchordin has been cloned
(SEQ ID NO:1). Nucleotides 1 to 2601 (SEQ ID NO:3) of
this cDNA encode an 867 amino acid protein (SEQ ID NO:2)
that has homology to Xenopus chordin (Sasai et al., Cell
79:779, 1994).

The invention encompasses nucleic acids that have a
sequence that is substantially identical to a huchordin
nucleic acid sequence. A nucleic acid which is
substantially identical to a given reference nucleic acid
molecule is hereby defined as a nucleic acid having a
sequence that has at least 85%, preferably 90%, and more
preferably 95%, 98%, 99% or more identity to the sequence
of the given reference nucleic acid molecule, e.g., the
nucleic acid sequence of SEQ ID NO:1.

A polypeptide or nucleic acid molecule which is
"substantially identical" to a given reference
polypeptide or nucleic acid molecule is a polypeptide or
nucleic acid molecule having a sequence that has at least
85%, preferably 90%, and more preferably 95%, 98%, 99% or
more identity to the sequence of the given reference
polypeptide or nucleic acid molecule, e.g., the
polypeptide sequence of SEQ ID NO:2 or the nucleic acid
sequence of SEQ ID NO:1.

The nucleic acid molecules of the invention can be
inserted into vectors, described below, which will
facilitate expression of the gene. The nucleic acid
molecules and polypeptides of the invention can be used
directly as diagnostic or therapeutic agents, or (in the
case of a polypeptide) can be used to generate antibodies
that, in turn, are therapeutically useful. Accordingly,
expression vectors containing the nucleic acid molecules
of the invention, cells transfected with these vectors,
the polypeptides expressed by these vectors, and
antibodies generated against either the entire
polypeptide or an antigenic fragment thereof are among
the preferred embodiments.

A transformed cell is any cell into which (or into
an ancestor of which) has been introduced, by means of
recombinant DNA techniques, a nucleic acid molecule
encoding a polypeptide of the invention (e.g., a
huchordin polypeptide).

An isolated nucleic acid molecule is a nucleic acid
molecule that is separated from the 5' and 3' coding
sequences with which it is immediately contiguous in the
naturally occurring genome of an organism. Isolated
nucleic acid molecules include nucleic acid molecules
which are not naturally occurring, e.g., nucleic acid
molecules created by recombinant DNA techniques.

Nucleic acid molecules include both RNA and DNA,
including cDNA, genomic DNA, and synthetic (e.g.,
chemically synthesized) DNA. Where single-stranded, the
nucleic acid molecule may be a sense strand or an
antisense strand.

The invention also encompasses nucleic acid
molecules that hybridize, preferably under stringent
conditions, to a nucleic acid molecule encoding a
huchordin polypeptide (e.g., a nucleic acid molecule
having the sequence shown in SEQ ID NO:1, a nucleic acid
molecule (SEQ ID NO:3) having the sequence of the
huchordin-encoding portion of the sequence of SEQ ID
NO:1, or a nucleic acid molecule having the sequence of
the protein coding portion of ATCC deposit No. 98481).
Preferably the hybridizing nucleic acid molecule consists
of 400, more preferably 200 nucleotides. Preferred
hybridizing nucleic acid molecules have a biological
activity possessed by huchordin.

The invention also features substantially pure or
isolated huchordin polypeptides, including those that
correspond to various functional domains of huchordin, or
fragments thereof. The polypeptides of the invention
encompass amino acid sequences that are substantially
identical to the amino acid sequence shown in Fig. 1
(SEQ ID NO:2).

The polypeptides of the invention can also be
chemically synthesized, or they can be purified from
tissues in which they are naturally expressed, according
to standard biochemical methods of purification.

Also included in the invention are functional
polypeptides which possess one or more of the biological
functions or activities of huchordin. These functions
include the ability to bind some or all of the proteins
which normally bind to huchordin. A functional
polypeptide is also considered within the scope of the
invention if it serves as an antigen for production of
antibodies that specifically bind to huchordin. In many
cases, functional polypeptides retain one or more domains
present in the naturally-occurring form of the
polypeptide.

The functional polypeptides may contain a primary
amino acid sequence that has been modified from those
disclosed herein. Preferably these modifications consist
of conservative amino acid substitutions, as described
herein.

The terms "protein" and "polypeptide" are used
herein to describe any chain of amino acids, regardless
of length or post-translational modification (for
example, glycosylation or phosphorylation). Thus, the
term "huchordin polypeptides" includes full-length,
naturally occurring huchordin protein, as well as a
recombinantly or synthetically produced polypeptide that
corresponds to a full-length, naturally occurring
huchordin protein or to particular domains or portions of
a naturally occurring protein. The term also encompasses
mature huchordin which has an added amino-terminal
methionine (useful for expression in prokaryotic cells).

The term "purified" as used herein refers to a
nucleic acid or peptide that is substantially free of
cellular material, viral material, or culture medium when
produced by recombinant DNA techniques, or chemical
precursors or other chemicals when chemically
synthesized.

Polypeptides or other compounds of interest are said
to be "substantially pure" when they are within
preparations that are at least 60% by weight (dry weight)
the compound of interest. Preferably, the preparation is
at least 75%, more preferably at least 90%, and most
preferably at least 99%, by weight the compound of
interest. Purity can be measured by any appropriate
standard method, for example, by column chromatography,
polyacrylamide gel electrophoresis, or HPLC analysis.

Where a particular polypeptide or nucleic acid
molecule is said to have a specific percent identity to a
reference polypeptide or nucleic acid molecule of a
defined length, the percent identity is relative to the
reference polypeptide or nucleic acid molecule. Thus, a
peptide that is 50% identical to a reference polypeptide
that is 100 amino acids long can be a 50 amino acid
polypeptide that is completely identical to a 50 amino
acid long portion of the reference polypeptide. It might
also be a 100 amino acid long polypeptide which is 50%
identical to the reference polypeptide over its entire
length. Of course, many other polypeptides will meet the
same criteria. The same rule applies for nucleic acid
molecules.
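
The reference-length rule above can be illustrated with a
short sketch. It is only an illustration under stated
assumptions: the function name, the ungapped comparison at a
fixed offset, and the toy sequences are hypothetical and are
not the sequence analysis software referred to in this
specification.

```python
def percent_identity(candidate: str, reference: str, offset: int = 0) -> float:
    """Percent identity of `candidate`, expressed relative to the reference length.

    Assumption for illustration: the candidate is compared without gaps to the
    stretch of the reference that starts at `offset` (0-based).
    """
    if not reference:
        raise ValueError("reference must not be empty")
    if offset < 0 or offset + len(candidate) > len(reference):
        raise ValueError("candidate must lie within the reference")
    window = reference[offset:offset + len(candidate)]
    matches = sum(1 for a, b in zip(candidate, window) if a == b)
    # The denominator is the reference length, not the candidate length.
    return 100.0 * matches / len(reference)


# Hypothetical 100-residue reference (not a huchordin sequence).
reference = "ACDEFGHIKL" * 10

# Case 1: a 50-residue peptide identical to a 50-residue portion of the reference.
fragment = reference[20:70]
print(percent_identity(fragment, reference, offset=20))

# Case 2: a 100-residue polypeptide that matches at every other position.
variant = "".join(c if i % 2 == 0 else "W" for i, c in enumerate(reference))
print(percent_identity(variant, reference))
```

Both hypothetical cases give the same value because the
matching positions are counted against the full length of
the reference in each case.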

For polypeptides, the length of the reference
polypeptide sequence will generally be at least 16 amino
acids, preferably at least 20 amino acids, more
preferably at least 25 amino acids, and most preferably
35 amino acids, 50 amino acids, or 100 amino acids. For
nucleic acids, the length of the reference nucleic acid
sequence will generally be at least 50 nucleotides,
preferably at least 60 nucleotides, more preferably at
least 75 nucleotides, and most preferably 100 nucleotides
or 300 nucleotides.

In the case of polypeptide sequences which are less
than 100% identical to a reference sequence, the
non-identical positions are preferably, but not necessarily,
conservative substitutions for the reference sequence.
Conservative substitutions typically include
substitutions within the following groups: glycine and
alanine; valine, isoleucine, and leucine; aspartic acid
and glutamic acid; asparagine and glutamine; serine and
threonine; lysine and arginine; and phenylalanine and
tyrosine.
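
As a minimal sketch, the substitution groups listed above
can be written as sets of one-letter amino acid codes and
used to classify the positions at which two equal-length
sequences differ. The function names and the example
sequences are hypothetical, and the ungapped, equal-length
comparison is an assumption made for brevity.

```python
# Groups taken from the list above, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"G", "A"},        # glycine, alanine
    {"V", "I", "L"},   # valine, isoleucine, leucine
    {"D", "E"},        # aspartic acid, glutamic acid
    {"N", "Q"},        # asparagine, glutamine
    {"S", "T"},        # serine, threonine
    {"K", "R"},        # lysine, arginine
    {"F", "Y"},        # phenylalanine, tyrosine
]


def is_conservative(ref_residue: str, new_residue: str) -> bool:
    """True if the two different residues fall in the same group."""
    if ref_residue == new_residue:
        return False  # identical residues are not a substitution
    return any(ref_residue in g and new_residue in g for g in CONSERVATIVE_GROUPS)


def classify_differences(candidate: str, reference: str):
    """List (1-based position, reference residue, candidate residue, conservative?)."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must be the same length (ungapped comparison)")
    return [
        (i + 1, ref, cand, is_conservative(ref, cand))
        for i, (cand, ref) in enumerate(zip(candidate, reference))
        if cand != ref
    ]


# Hypothetical sequences: G->A and D->E are conservative, F->W is not.
for row in classify_differences("AVEKSW", "GVDKSF"):
    print(row)
```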

Sequence identity can be measured using sequence
analysis software (for example, the Sequence Analysis
Software Package of the Genetics Computer Group,
University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, WI 53705), with the default
parameters as specified therein.

The invention also features antibodies, e.g.,
monoclonal, polyclonal, and engineered antibodies, which
specifically bind huchordin. By "specifically binds" is
meant an antibody that recognizes and binds to a
particular antigen, e.g., a huchordin polypeptide of the
invention, but which does not substantially recognize or
bind to other molecules in a sample, e.g., a biological
sample, which includes huchordin.

The invention also features antagonists and agonists
of huchordin that can inhibit or enhance, respectively,
one or more of the biological activities of huchordin.
Suitable antagonists can include small molecules (i.e.,
molecules with a molecular weight below about 500), large
molecules (i.e., molecules with a molecular weight above
about 500), antibodies that bind and "neutralize"
huchordin (as described below), polypeptides which
compete with a native form of huchordin for binding to a
protein, e.g., a member of the TGF-β superfamily, and
nucleic acid molecules that interfere with transcription
of huchordin (for example, antisense nucleic acid
molecules and ribozymes). Agonists of huchordin also
include small and large molecules, and antibodies other
than neutralizing antibodies.

The invention also features molecules which can
increase or decrease the expression of huchordin (e.g.,
by influencing transcription or translation). Such
molecules include small molecules (i.e., molecules with a
molecular weight below about 500), large molecules (i.e.,
molecules with a molecular weight above about 500), and
nucleic acid molecules that can be used to inhibit the
expression of huchordin (for example, antisense and
ribozyme molecules) or to enhance the expression of
huchordin (for example, molecules that bind to a huchordin
transcription regulatory sequence and increase huchordin
transcription).

The invention also features molecules which alter
the cellular localization of huchordin. Such molecules
can be used to treat disorders associated with aberrant
cellular localization of huchordin.

In addition, the invention features substantially
pure polypeptides that functionally interact with
huchordin, e.g., novel members of the TGF-β superfamily,
and the nucleic acid molecules that encode them.

The invention encompasses methods for treating
disorders associated with aberrant expression, activity,
or localization of huchordin. Thus, the invention
includes methods for treating disorders associated with
excessive expression or activity of huchordin. Such
methods entail administering a compound which decreases
the expression or activity of huchordin. The invention
also includes methods for treating disorders associated
with insufficient expression or activity of huchordin.
These methods entail administering a compound which
increases the expression or activity of huchordin.

The invention also features methods for detecting a
huchordin polypeptide. Such methods include: obtaining a
biological sample; contacting the sample with an antibody
that specifically binds huchordin under conditions which
permit specific binding; and detecting any
antibody-huchordin complexes formed.

In addition, the present invention encompasses
methods and compositions for the diagnostic evaluation,
typing, and prognosis of disorders associated with
inappropriate expression or activity of huchordin. For
example, the nucleic acid molecules of the invention can
be used as diagnostic hybridization probes to detect, for
example, inappropriate expression of huchordin or
mutations in the huchordin gene. Such methods may be
used to classify cells by the level of huchordin
expression.

Thus, the invention features a method for diagnosing
a disorder associated with aberrant activity of
huchordin, the method including obtaining a biological
sample from a patient and measuring huchordin activity in
the biological sample, wherein increased or decreased
huchordin activity in the biological sample compared to a
control indicates that the patient suffers from a
disorder associated with aberrant activity of huchordin.

The present invention further provides for 
diagnostic kits for the practice of such methods. 

The invention features methods of identifying
compounds that modulate the expression or activity of
huchordin by assessing the expression or activity of
huchordin in the presence and absence of a selected
compound. A difference in the level of expression or
activity of huchordin in the presence and absence of the
selected compound indicates that the selected compound is
capable of modulating expression or activity of
huchordin. Expression can be assessed either at the
level of gene expression (e.g., by measuring mRNA) or
protein expression by techniques that are well known to
skilled artisans. The activity of huchordin can be
assessed functionally.

Also included in the invention are: a method for
detecting huchordin in a sample, the method including:

(a) obtaining a biological sample;

(b) contacting the biological sample with an
antibody that specifically binds huchordin under
conditions that allow the formation of huchordin-antibody
complexes; and

(c) detecting the complexes, if any, as an
indication of the presence of huchordin in the sample.

In another aspect, the invention features a method
of identifying a compound that modulates the activity of
huchordin, the method including comparing the level of
activity of huchordin in a cell in the presence and
absence of a selected compound, wherein a difference in
the level of activity in the presence and absence of the
selected compound indicates that the selected compound
modulates the activity of huchordin.

The invention also features a method for diagnosing
a disorder associated with aberrant expression of
huchordin, the method including obtaining a biological
sample from a patient and measuring huchordin expression
in the biological sample, wherein increased or decreased
huchordin expression in the biological sample compared to
a control indicates that the patient suffers from a
disorder associated with aberrant expression of
huchordin.

In another aspect, the invention features a method
for diagnosing a disorder associated with aberrant
activity of huchordin, the method including obtaining a
biological sample from a patient and measuring huchordin
activity in the biological sample, wherein increased or
decreased huchordin activity in the biological sample
compared to a control indicates that the patient suffers
from a disorder associated with aberrant activity of
huchordin.

The preferred methods and materials are described
below in examples which are meant to illustrate, not
limit, the invention. Skilled artisans will recognize
methods and materials that are similar or equivalent to
those described herein, and that can be used in the
practice or testing of the present invention.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art
to which this invention belongs. Although methods and
materials similar or equivalent to those described herein
can be used in the practice or testing of the present
invention, the preferred methods and materials are
described herein. All publications, patent applications,
patents, and other references mentioned herein are
incorporated by reference in their entirety. In the case
of conflict, the present specification, including
definitions, will control. In addition, the materials,
methods, and examples are illustrative only and are not
intended to be limiting.

Other features and advantages of the invention will
be apparent from the detailed description, and from the
claims.



Brief Description of the Drawing 
Figure 1 is a depiction of the sequence of a cDNA
encoding huchordin (SEQ ID NO:1) and the deduced amino
acid sequence (SEQ ID NO:2) of huchordin.

Figure 2 is an alignment of a portion of the amino
acid sequence of huchordin (upper sequence of each pair)
and a portion of the amino acid sequence of Xenopus chordin
(lower sequence of each pair; SEQ ID NO:4).

Detailed Description 

Huchordin, a human protein described here for the
first time, is an 867 amino acid protein that is predicted
to be a secreted protein. A putative signal sequence
encompasses amino acids 1-26 of huchordin.

Huchordin bears homology to Xenopus chordin (Sasai
et al., Cell 79:779, 1994). Like Xenopus chordin,
huchordin includes several cysteine-rich repeats.
Xenopus chordin includes four such repeats (R1, R2, R3,
and R4) of 58-74 residues (Sasai et al., Cell 79:779,
1994), each of which includes 10 cysteine residues at
conserved positions.

Huchordin contains three intact cysteine-rich
repeats (amino acids 51-125; amino acids 696-762; and
amino acids 784-844), corresponding to R1, R3, and R4 of
chordin. The huchordin cysteine-rich repeat (amino acids
644-674) corresponding to R2 of chordin contains only six
of the 10 conserved cys residues and is properly
considered a half repeat.
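
The residue ranges given above can be collected into a
simple coordinate table, which makes it easy to compute the
length of each region and to confirm that the regions fit
within the 867 amino acid protein without overlapping. This
is only a bookkeeping sketch: the dictionary keys and helper
names are hypothetical labels, while the coordinates are the
ones stated in the text.

```python
PROTEIN_LENGTH = 867  # amino acids, as stated for huchordin

# 1-based, inclusive residue ranges taken from the description above.
REGIONS = {
    "putative signal sequence": (1, 26),
    "repeat corresponding to R1": (51, 125),
    "half repeat corresponding to R2": (644, 674),
    "repeat corresponding to R3": (696, 762),
    "repeat corresponding to R4": (784, 844),
}


def region_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two inclusive ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


for name, (start, end) in REGIONS.items():
    print(f"{name}: residues {start}-{end}, {region_length(start, end)} aa")
```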

Four potential N-glycosylation sites (217, 351, 365,
and 434) are located between R1 and R2 in huchordin.
Chordin also has four such sites. Two of the potential
huchordin N-glycosylation sites, N351 and N434, are in
positions that are conserved in chordin.
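
Potential N-glycosylation sites of this kind are commonly
predicted by scanning a protein sequence for the consensus
sequon N-X-S/T, where X is any residue other than proline.
The sketch below shows such a scan. It assumes this commonly
used consensus rule, which this specification does not
itself state, and it uses a short hypothetical peptide
rather than the huchordin sequence.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead lets adjacent or overlapping sequons all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def potential_n_glycosylation_sites(protein: str) -> list:
    """Return 1-based positions of asparagines that begin an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: N2 (N-G-T) and N10 (N-L-S) qualify; N6 (N-P-S) does not.
print(potential_n_glycosylation_sites("MNGTANPSQNLS"))
```

A match only marks a candidate site; whether a given
asparagine is actually glycosylated must be determined
experimentally.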

Overall, the huchordin gene described herein has 66%
homology at the nucleotide level to the Xenopus chordin
gene, and the huchordin protein described herein has 53%
homology to Xenopus chordin protein at the amino acid
level.

Huchordin Nucleic Acid Molecules 

The huchordin nucleic acid molecules of the
invention can be cDNA, genomic DNA, synthetic DNA, or
RNA, and can be double-stranded or single-stranded (i.e.,
either a sense or an antisense strand). Fragments of
these molecules are also considered within the scope of
the invention, and can be produced, for example, by the
polymerase chain reaction (PCR) or generated by treatment
with one or more restriction endonucleases. A
ribonucleic acid (RNA) molecule encoding huchordin can be
produced by in vitro transcription.

The nucleic acid molecules of the invention can
contain naturally occurring sequences, or sequences that
differ from those that occur naturally, but, due to the
degeneracy of the genetic code, encode the same
polypeptide (for example, the polypeptide of SEQ
ID NO:2). In addition, these nucleic acid molecules are
not limited to sequences that only encode polypeptides,
and thus, can include some or all of the non-coding
sequences that lie upstream or downstream from the
huchordin coding sequence.
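
The effect of the degeneracy of the genetic code can be
shown with a short sketch that translates two different
coding sequences with the standard genetic code and confirms
that they encode the same peptide. The two nine-nucleotide
sequences are hypothetical examples, not fragments of SEQ ID
NO:1, and the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a sense-strand DNA coding sequence, stopping at a stop codon."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    peptide = []
    for i in range(0, len(coding_dna), 3):
        aa = CODON_TABLE[coding_dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Two hypothetical sequences that differ at the third position of two codons.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```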

The nucleic acid molecules of the invention can be
synthesized (for example, by phosphoramidite-based
synthesis) or obtained from a biological cell (e.g., by
cDNA cloning), such as the cell of a mammal. Thus, the
nucleic acids can be those of a human, mouse, rat, guinea
pig, cow, sheep, horse, pig, rabbit, monkey, dog, or cat.
Combinations or modifications of the nucleotides within
these types of nucleic acids are also encompassed.

In addition, the isolated nucleic acid molecules of
the invention encompass fragments that are not found as
such in the natural state. Thus, the invention
encompasses recombinant molecules, such as those in which
a nucleic acid molecule (for example, an isolated nucleic
acid molecule encoding huchordin) is incorporated into a
vector (for example, a plasmid or viral vector) or into
the genome of a heterologous cell (or the genome of a
homologous cell, at a position other than the natural
chromosomal location). Recombinant nucleic acid
molecules and uses therefor are discussed further below.

The invention encompasses peptide nucleic acids
(PNA) and PNA-DNA chimeras having the sequence of a
portion of the huchordin gene. DNA oligomers and PNA-DNA
chimeric oligomers can be used for antisense inhibition
(i.e., inhibition of translation) and anti-gene
inhibition (i.e., inhibition of transcription) (Hyrup et
al., Bioorganic & Medicinal Chem. 4:5, 1996; Finn et al.,
Nucl. Acids Res. 24:3357, 1996). PNA oligomers can also
be used in DNA pre-gel hybridization as an alternative to
Southern hybridization.

In the event the nucleic acid molecules of the
invention encode or act as antisense molecules, they can
be used, for example, to regulate translation of huchordin
mRNA. Techniques associated with the use of huchordin
nucleic acid molecules for detection or regulation of
huchordin expression can be used to diagnose and/or treat
disorders associated with aberrant huchordin expression.
These nucleic acid molecules are discussed further below
in the context of their clinical utility.

The invention encompasses single-stranded nucleic
acid probes which hybridize to a huchordin nucleic acid
molecule (e.g., the nucleic acid molecule of SEQ ID
NO:1). Such probes can be used in diagnostic methods to




WO 99/15556 



PCT/US98/20675 




detect mutations in the huchordin gene. For example,
probes can be used to create a high density
oligonucleotide probe array which can be used
diagnostically to detect mutations and allelic variations
in the huchordin gene (Cronin et al., Human Mutation
7:244, 1996).

Also within the invention are single-stranded
nucleic acid primers which can be used to PCR amplify all
or part of a huchordin-encoding nucleic acid molecule.

The invention also encompasses nucleic acid
molecules that hybridize under stringent conditions to a
nucleic acid molecule encoding a huchordin polypeptide.
The protein encoding portion of the cDNA sequence
described herein (SEQ ID NO:1) can be used to identify
these nucleic acid molecules, which include, for example,
nucleic acids that encode homologous polypeptides in
other mammalian species, splice variants of the huchordin
gene in humans or other mammals, and allelic variants of
the huchordin gene or the genes encoding homologs of
huchordin in other mammals (a naturally-occurring
mammalian gene). Further, genes may exist at other
genetic loci within the genome that encode proteins which
have extensive homology to huchordin polypeptides or one
or more domains of huchordin polypeptides. Accordingly,
the invention features methods of detecting and isolating
these nucleic acid molecules. Using these methods, a
sample (for example, a nucleic acid library, such as a
cDNA or genomic library) is contacted (or "screened")
with a huchordin-specific probe (for example, a fragment
of SEQ ID NO:1 that is at least 25 or 50 nucleotides
long). The probe will selectively hybridize to nucleic
acids encoding related polypeptides (or to complementary
sequences thereof). The term "selectively hybridize" is
used to refer to an event in which a probe binds to
nucleic acids encoding huchordin (or to complementary
sequences thereof) to a detectably greater extent than to
nucleic acids encoding Xenopus chordin. The probe, which
can contain at least 25 nucleotides (for example, 25, 50,
100, or 200 nucleotides), can be produced using any of
several standard methods (see, for example, Ausubel et
al., "Current Protocols in Molecular Biology, Vol. I,"
Green Publishing Associates, Inc., and John Wiley & Sons,
Inc., NY, 1989). For example, the probe can be generated
using PCR amplification methods in which oligonucleotide
primers are used to amplify a huchordin-specific nucleic
acid sequence that can be used as a probe to screen a
nucleic acid library and thereby detect nucleic acid
molecules (within the library) that hybridize to the
probe.

One single-stranded nucleic acid is said to
hybridize to another if a duplex forms between them.
This occurs when one nucleic acid contains a sequence
that is the reverse and complement of the other (this
same arrangement gives rise to the natural interaction
between the sense and antisense strands of DNA in the
genome and underlies the configuration of the "double
helix"). Complete complementarity between the
hybridizing regions is not required in order for a duplex
to form; it is only necessary that the number of paired
bases is sufficient to maintain the duplex under the
hybridization conditions used.
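
The "reverse and complement" relationship described above
can be made concrete with a short sketch that builds the
reverse complement of a DNA strand and counts how many
Watson-Crick pairs two equal-length strands would form when
aligned antiparallel. The function names and the
eight-nucleotide sequences are hypothetical, and counting
paired bases is only a crude stand-in: it does not predict
whether a duplex is stable under any particular
hybridization conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def paired_bases(strand_a: str, strand_b: str) -> int:
    """Count Watson-Crick pairs between two equal-length strands (both written
    5' to 3') when strand_b is aligned antiparallel to strand_a."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be the same length for this simple count")
    return sum(
        1 for a, b in zip(strand_a.upper(), reverse_complement(strand_b)) if a == b
    )


probe = "ATGCGTAC"                      # hypothetical probe
perfect_target = reverse_complement(probe)
mismatched_target = "C" + perfect_target[1:]   # one position altered

print(perfect_target)
print(paired_bases(probe, perfect_target), paired_bases(probe, mismatched_target))
```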

Typically, hybridization conditions are of low to
moderate stringency. These conditions favor specific
interactions between completely complementary sequences,
but allow some non-specific interaction between less than
perfectly matched sequences to occur as well. After
hybridization, the nucleic acids can be "washed" under
conditions of moderate or high stringency to dissociate
duplexes that are bound together by some non-specific
interaction (the nucleic acids that form these duplexes
are thus not completely complementary).

As is known in the art, the optimal conditions for
washing are determined empirically, often by gradually
increasing the stringency. The parameters that can be
changed to affect stringency include, primarily,
temperature and salt concentration. In general, the
lower the salt concentration and the higher the
temperature, the higher the stringency. Washing can be
initiated at a low temperature (for example, room
temperature) using a solution containing a salt
concentration that is equivalent to or lower than that of
the hybridization solution. Subsequent washing can be
carried out using progressively warmer solutions having
the same salt concentration. As alternatives, the salt
concentration can be lowered and the temperature
maintained in the washing step, or the salt concentration
can be lowered and the temperature increased. Additional
parameters can also be altered. For example, use of a
destabilizing agent, such as formamide, alters the
stringency conditions.

In reactions where nucleic acids are hybridized, the
conditions used to achieve a given level of stringency
will vary. There is not one set of conditions, for
example, that will allow duplexes to form between all
nucleic acids that are 85% identical to one another;
hybridization also depends on unique features of each
nucleic acid. The length of the sequence, the
composition of the sequence (for example, the content of
purine-like nucleotides versus the content of
pyrimidine-like nucleotides) and the type of nucleic acid
(for example, DNA or RNA) affect hybridization. An additional
consideration is whether one of the nucleic acids is
immobilized (for example, on a filter).

An example of a progression from lower to higher
stringency conditions is the following, where the salt
content is given as the relative abundance of SSC (a salt
solution containing sodium chloride and sodium citrate;
2X SSC is 10-fold more concentrated than 0.2X SSC).
Nucleic acids are hybridized at 42°C in 2X SSC/0.1% SDS
(sodium dodecylsulfate; a detergent) and then washed in
0.2X SSC/0.1% SDS at room temperature (for conditions of
low stringency); 0.2X SSC/0.1% SDS at 42°C (for
conditions of moderate stringency); and 0.1X SSC at 68°C
(for conditions of high stringency). Washing can be
carried out using only one of the conditions given, or
each of the conditions can be used (for example, washing
for 10-15 minutes each in the order listed above). Any
or all of the washes can be repeated. As mentioned
above, optimal conditions will vary and can be determined
empirically.
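
The wash progression above can be tabulated in a few lines
of code, which also converts each SSC strength into an
approximate salt concentration. The conversion assumes the
commonly used formulation in which 1X SSC is 150 mM sodium
chloride and 15 mM sodium citrate; that formulation is not
stated in this specification, and the class and field names
are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Assumption: conventional 1X SSC composition (not given in the text above).
NACL_MM_PER_1X_SSC = 150.0
CITRATE_MM_PER_1X_SSC = 15.0


@dataclass(frozen=True)
class Step:
    label: str
    ssc_x: float              # SSC strength, e.g. 0.2 for 0.2X SSC
    sds_percent: float        # 0.0 where the text lists no SDS
    temp_c: Optional[float]   # None means room temperature

    def nacl_mm(self) -> float:
        return self.ssc_x * NACL_MM_PER_1X_SSC


HYBRIDIZATION = Step("hybridization", 2.0, 0.1, 42.0)
WASHES = [
    Step("low stringency wash", 0.2, 0.1, None),
    Step("moderate stringency wash", 0.2, 0.1, 42.0),
    Step("high stringency wash", 0.1, 0.0, 68.0),
]

for step in [HYBRIDIZATION, *WASHES]:
    temp = "room temperature" if step.temp_c is None else f"{step.temp_c:g} C"
    print(f"{step.label}: {step.ssc_x:g}X SSC (~{step.nacl_mm():.0f} mM NaCl), {temp}")
```

Listing the steps this way makes the trend in the text easy
to see: from one wash to the next the salt concentration
stays the same or falls while the temperature rises.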

A second set of conditions that are considered
"stringent conditions" are those in which hybridization
is carried out at 50°C in Church buffer (7% SDS,
0.5% NaHPO4, 1 M EDTA, 1% BSA) and washing is carried out
at 50°C in 2X SSC.

As an alternative to screening a cDNA library, a
human total genomic DNA library can be screened using
huchordin probes. Huchordin-positive clones can then be
sequenced and, further, the intron/exon structure of the
human huchordin gene can be elucidated. Once genomic
sequence is obtained, oligonucleotide primers can be
designed based on the sequence for use in the isolation,
via, for example, reverse transcriptase-coupled PCR, of
huchordin splice variants.

Further, a previously unknown gene sequence can be
isolated by performing PCR using two degenerate
oligonucleotide primer pools designed on the basis of
nucleotide sequences within the huchordin cDNAs defined
herein. The template for the reaction can be cDNA
obtained by reverse transcription of mRNA prepared from
human or non-human cell lines or tissue known or
suspected to express a huchordin gene allele. The PCR
product can be subcloned and sequenced to ensure that the
amplified sequences represent the sequences of a
huchordin-like nucleic acid sequence.

The PCR fragment can then be used to isolate a full
length cDNA clone by a variety of methods. For example,
the amplified fragment can be labeled and used to screen
a bacteriophage cDNA library. Alternatively, the labeled
fragment can be used to screen a genomic library.

PCR technology also can be used to isolate full
length cDNA sequences. For example, RNA can be isolated,
following standard procedures, from an appropriate
cellular or tissue source. A reverse transcription
reaction can be performed on the RNA using an
oligonucleotide primer specific for the most 5' end of
the amplified fragment for the priming of first strand
synthesis. The resulting RNA/DNA hybrid can then be
"tailed" with guanines using a standard terminal
transferase reaction, the hybrid can be digested with
RNase H, and second strand synthesis can then be primed
with a poly-C primer. Thus, cDNA sequences upstream of
the amplified fragment can easily be isolated. For a
review of useful cloning strategies, see, e.g., Sambrook
et al., supra; and Ausubel et al., supra.

In cases where the gene identified is the normal
(wild type) gene, this gene can be used to isolate mutant
alleles of the gene. Such an isolation is preferable in
processes and disorders which are known or suspected to
have a genetic basis.

WO 99/15556                                PCT/US98/20675

A cDNA of a mutant gene can be isolated, for
example, by using PCR, a technique which is well-known to
one skilled in the art. In this case, the first cDNA
strand can be synthesized by hybridizing an oligo-dT
oligonucleotide to mRNA isolated from tissue known or
suspected to express the gene in an individual putatively
carrying the mutant allele, and by extending the new
strand with reverse transcriptase. The second strand of
the cDNA can then be synthesized using an oligonucleotide
that hybridizes specifically to the 5' end of the normal
gene. Using these two primers, the product is then
amplified via PCR, cloned into a suitable vector, and
subjected to DNA sequence analysis by methods well known
in the art. By comparing the DNA sequence of the mutant
gene to that of the normal gene, the mutation(s)
responsible for the loss or alteration of function of the
mutant gene product can be ascertained.
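
The comparison of mutant and normal sequences can be sketched in
Python as follows. This is a minimal illustration that assumes the two
sequences are already aligned and of equal length, so it reports base
substitutions only; insertions and deletions would require a proper
alignment step first. Both sequences shown are hypothetical.

```python
def find_substitutions(normal, mutant):
    """Return (position, normal_base, mutant_base) for each mismatch.

    Positions are 1-based. The inputs must be pre-aligned and of equal
    length; this sketch does not handle insertions or deletions.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be aligned to equal length")
    pairs = zip(normal.upper(), mutant.upper())
    return [(i + 1, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical sequences, for illustration only.
normal_seq = "ATGGCCTGCAAG"
mutant_seq = "ATGGCCTACAAG"
print(find_substitutions(normal_seq, mutant_seq))
```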

Alternatively, a genomic or cDNA library can be
constructed and screened using DNA or RNA, respectively,
from a tissue known to or suspected of expressing the
gene of interest in an individual suspected of or known
to carry the mutant allele. The normal gene or any
suitable fragment thereof can then be labeled and used as
a probe to identify the corresponding mutant allele in
the library. The clone containing this gene can then be
purified through methods routinely practiced in the art,
and subjected to sequence analysis using standard
techniques as described herein.

Additionally, an expression library can be
constructed using DNA isolated from or cDNA synthesized
from a tissue known to or suspected of expressing the
gene of interest in an individual suspected of or known
to carry the mutant allele. In this manner, gene
products made by the putatively mutant tissue can be
expressed and screened using standard antibody screening
techniques in conjunction with antibodies raised against
the normal gene product, as described herein. For
screening techniques, see, for example, Harlow, E. and
Lane, eds., 1988, "Antibodies: A Laboratory Manual,"
Cold Spring Harbor Press, Cold Spring Harbor.

In cases where the mutation results in an expressed
gene product with altered function (e.g., as a result of
a missense mutation), a polyclonal set of antibodies is
likely to cross-react with the mutant gene product.
Library clones detected via their reaction with such
labeled antibodies can be purified and subjected to
sequence analysis as described herein.

Once detected, the nucleic acid molecules can be
isolated by any of a number of standard techniques (see,
for example, Sambrook et al., "Molecular Cloning, A
Laboratory Manual," 2nd Ed. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY, 1989).

The invention also encompasses: (a) expression
vectors that contain any of the foregoing huchordin-related
coding sequences and/or their complements (that
is, "antisense" sequence); (b) expression vectors that
contain any of the foregoing huchordin-related coding
sequences operatively associated with a regulatory
element (examples of which are given below) that directs
the expression of the coding sequences; (c) expression
vectors containing, in addition to sequences encoding a
huchordin polypeptide, nucleic acid sequences that are
unrelated to nucleic acid sequences encoding huchordin,
such as molecules encoding a reporter, a marker, or a
portion of an immunoglobulin; and (d) genetically
engineered host cells that contain any of the foregoing
expression vectors and thereby express the nucleic acid
molecules of the invention in the host cell.

Recombinant nucleic acid molecules can contain a
sequence encoding a soluble huchordin polypeptide, mature
huchordin (e.g., amino acids 27-867 of SEQ ID NO:2), or
huchordin having a signal sequence. The full length
huchordin polypeptide, a domain of huchordin, or a
fragment thereof may be fused to additional polypeptides,
as described below. Similarly, the nucleic acid
molecules of the invention can encode the mature form of
huchordin or a form that encodes a polypeptide which
facilitates secretion. In the latter instance, the
polypeptide is typically referred to as a proprotein,
which can be converted into an active form by removal of
the signal sequence, for example, within the host cell.

The regulatory elements referred to above include,
but are not limited to, inducible and non-inducible
promoters, enhancers, operators and other elements, which
are known to those skilled in the art, and which drive or
otherwise regulate gene expression. Such regulatory
elements include but are not limited to the
cytomegalovirus hCMV immediate early gene, the early or
late promoters of SV40 adenovirus, the lac system, the
trp system, the TAC system, the TRC system, the major
operator and promoter regions of phage λ, the control
regions of fd coat protein, the promoter for
3-phosphoglycerate kinase, the promoters of acid
phosphatase, and the promoters of the yeast α-mating
factors.

Similarly, the nucleic acid can form part of a
hybrid gene encoding additional polypeptide sequences,
for example, sequences that function as a marker or
reporter. Examples of marker or reporter genes include
β-lactamase, chloramphenicol acetyltransferase (CAT),
adenosine deaminase (ADA), aminoglycoside
phosphotransferase (neoʳ, G418ʳ), dihydrofolate reductase
(DHFR), hygromycin-B-phosphotransferase (HPH), thymidine
kinase (TK), lacZ (encoding β-galactosidase), and
xanthine guanine phosphoribosyltransferase (XGPRT). As
with many of the standard procedures associated with the
practice of the invention, skilled artisans will be aware
of additional useful reagents, for example, of additional
sequences that can serve the function of a marker or
reporter. Generally, the hybrid polypeptide will include
a first portion and a second portion; the first portion
being a huchordin polypeptide and the second portion
being, for example, the reporter described above or an
immunoglobulin constant region.

The expression systems that may be used for purposes
of the invention include, but are not limited to,
microorganisms such as bacteria (for example, E. coli and
B. subtilis) transformed with recombinant bacteriophage
DNA, plasmid DNA, or cosmid DNA expression vectors
containing the nucleic acid molecules of the invention;
yeast (for example, Saccharomyces and Pichia) transformed
with recombinant yeast expression vectors containing the
nucleic acid molecules of the invention (preferably
containing the nucleic acid sequence encoding huchordin
(contained within SEQ ID NO:Y)); insect cell systems
infected with recombinant virus expression vectors (for
example, baculovirus) containing the nucleic acid
molecules of the invention; plant cell systems infected
with recombinant virus expression vectors (for example,
cauliflower mosaic virus (CaMV) and tobacco mosaic virus
(TMV)) or transformed with recombinant plasmid expression
vectors (for example, Ti plasmid) containing huchordin
nucleotide sequences; or mammalian cell systems (for
example, COS, CHO, BHK, 293, VERO, HeLa, MDCK, WI38, and
NIH 3T3 cells) harboring recombinant expression
constructs containing promoters derived from the genome
of mammalian cells (for example, the metallothionein
promoter) or from mammalian viruses (for example, the
adenovirus late promoter and the vaccinia virus 7.5K
promoter).

In bacterial systems, a number of expression vectors
may be advantageously selected depending upon the use
intended for the gene product being expressed. For
example, when a large quantity of such a protein is to be
produced, for the generation of pharmaceutical
compositions containing huchordin polypeptides or for
raising antibodies to those polypeptides, vectors that
are capable of directing the expression of high levels of
fusion protein products that are readily purified may be
desirable. Such vectors include, but are not limited to,
the E. coli expression vector pUR278 (Ruther et al.,
EMBO J. 2:1791, 1983), in which the coding sequence of
the insert may be ligated individually into the vector in
frame with the lacZ coding region so that a fusion
protein is produced; pIN vectors (Inouye and Inouye,
Nucleic Acids Res. 13:3101-3109, 1985; Van Heeke and
Schuster, J. Biol. Chem. 264:5503-5509, 1989); and the
like. pGEX vectors may also be used to express foreign
polypeptides as fusion proteins with glutathione
S-transferase (GST). In general, such fusion proteins
are soluble and can easily be purified from lysed cells
by adsorption to glutathione-agarose beads followed by
elution in the presence of free glutathione. The pGEX
vectors are designed to include thrombin or factor Xa
protease cleavage sites so that the cloned target gene
product can be released from the GST moiety.

In an insect system, Autographa californica nuclear
polyhedrosis virus (AcNPV) can be used as a vector to
express foreign genes. The virus grows in Spodoptera
frugiperda cells. The coding sequence of the insert may
be cloned individually into non-essential regions (for
example, the polyhedrin gene) of the virus and placed
under control of an AcNPV promoter (for example, the
polyhedrin promoter). Successful insertion of the coding
sequence will result in inactivation of the polyhedrin
gene and production of non-occluded recombinant virus
(i.e., virus lacking the proteinaceous coat coded for by
the polyhedrin gene). These recombinant viruses are then
used to infect Spodoptera frugiperda cells in which the
inserted gene is expressed (for example, see Smith
et al., J. Virol. 46:584, 1983; Smith, U.S. Patent
No. 4,215,051).

In mammalian host cells, a number of viral-based
expression systems may be utilized. In cases where an
adenovirus is used as an expression vector, the nucleic
acid molecule of the invention may be ligated to an
adenovirus transcription/translation control complex, for
example, the late promoter and tripartite leader
sequence. This chimeric gene may then be inserted in the
adenovirus genome by in vitro or in vivo recombination.
Insertion in a non-essential region of the viral genome
(for example, region E1 or E3) will result in a
recombinant virus that is viable and capable of
expressing a huchordin gene product in infected hosts
(for example, see Logan and Shenk, Proc. Natl. Acad. Sci.
USA 81:3655-3659, 1984).

Specific initiation signals may also be required for
efficient translation of inserted nucleic acid molecules.
These signals include the ATG initiation codon and
adjacent sequences. In cases where an entire gene or
cDNA, including its own initiation codon and adjacent
sequences, is inserted into the appropriate expression
vector, no additional translational control signals may
be needed. However, in cases where only a portion of the
coding sequence is inserted, e.g., only the portion
encoding the mature form of a secreted protein, exogenous
translational control signals, including, perhaps, the
ATG initiation codon, must be provided. Furthermore, the
initiation codon must be in phase with the reading frame
of the desired coding sequence to ensure translation of
the entire insert. These exogenous translational control
signals and initiation codons can be of a variety of
origins, both natural and synthetic. The efficiency of
expression may be enhanced by the inclusion of
appropriate transcription enhancer elements,
transcription terminators, etc. (see Bittner et al.,
Methods in Enzymol. 153:516-544, 1987).
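
The requirement that an exogenous initiation codon be in phase with
the insert can be checked with a short Python sketch such as the one
below. The construct sequences, the index arguments, and the function
names are hypothetical and serve only to show how a one-base shift
between the ATG and the coding sequence changes the translated frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_phase(atg_index, coding_start):
    """True if the coding sequence begins a whole number of codons after the ATG."""
    return (coding_start - atg_index) % 3 == 0


def codons_until_stop(seq, atg_index):
    """Count codons read from the ATG up to (not including) the first stop codon."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    count = 0
    for i in range(atg_index, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Hypothetical constructs: vector-supplied ATG at index 6, insert follows.
in_frame = "GCCACCATGGGTGCTAAAGAATAA"       # insert starts at index 12
out_of_frame = "GCCACCATGGGTCGCTAAAGAATAA"  # one extra base; insert at index 13

print(is_in_phase(6, 12), codons_until_stop(in_frame, 6))
print(is_in_phase(6, 13), codons_until_stop(out_of_frame, 6))
```

In the out-of-phase construct the shifted frame meets a stop codon
early, which is one way a frame error prevents translation of the
entire insert.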

In addition, a host cell strain may be chosen which
modulates the expression of the inserted sequences, or
modifies and processes the gene product in the specific
fashion desired. Such modifications (for example,
glycosylation) and processing (for example, cleavage) of
protein products may be important for the function of the
protein. Different host cells have characteristic and
specific mechanisms for the post-translational processing
and modification of proteins and gene products.
Appropriate cell lines or host systems can be chosen to
ensure the correct modification and processing of the
foreign protein expressed. To this end, eukaryotic host
cells which possess the cellular machinery for proper
processing of the primary transcript, glycosylation, and
phosphorylation of the gene product may be used. The
mammalian cell types listed above are among those that
could serve as suitable host cells.

For long-term, high-yield production of recombinant
proteins, stable expression is preferred. For example,
cell lines which stably express the huchordin sequences
described above may be engineered. Rather than using
expression vectors which contain viral origins of
replication, host cells can be transformed with DNA
controlled by appropriate expression control elements
(for example, promoter, enhancer sequences, transcription
terminators, polyadenylation sites, etc.), and a
selectable marker. Following the introduction of the
foreign DNA, engineered cells may be allowed to grow for
1-2 days in an enriched medium, and then switched to a
selective medium. The selectable marker in the
recombinant plasmid confers resistance to the selection
and allows cells to stably integrate the plasmid into
their chromosomes and grow to form foci which in turn can
be cloned and expanded into cell lines. This method can
advantageously be used to engineer cell lines which
express huchordin. Such engineered cell lines may be
particularly useful in screening and evaluation of
compounds that affect the endogenous activity of the gene
product.

A number of selection systems can be used. For
example, the herpes simplex virus thymidine kinase
(Wigler et al., Cell 11:223, 1977), hypoxanthine-guanine
phosphoribosyltransferase (Szybalska and Szybalski, Proc.
Natl. Acad. Sci. USA 48:2026, 1962), and adenine
phosphoribosyltransferase (Lowy et al., Cell 22:817,
1980) genes can be employed in tk⁻, hgprt⁻ or aprt⁻ cells,
respectively. Also, anti-metabolite resistance can be
used as the basis of selection for the following genes:
dhfr, which confers resistance to methotrexate (Wigler
et al., Proc. Natl. Acad. Sci. USA 77:3567, 1980; O'Hare
et al., Proc. Natl. Acad. Sci. USA 78:1527, 1981); gpt,
which confers resistance to mycophenolic acid (Mulligan
and Berg, Proc. Natl. Acad. Sci. USA 78:2072, 1981); neo,
which confers resistance to the aminoglycoside G-418
(Colberre-Garapin et al., J. Mol. Biol. 150:1, 1981); and
hygro, which confers resistance to hygromycin (Santerre
et al., Gene 30:147, 1984).

Huchordin nucleic acid molecules are useful for
diagnosis of disorders associated with aberrant
expression of huchordin. Huchordin nucleic acid
molecules are also useful in genetic mapping and
chromosome identification.

Huchordin Polypeptides 

The huchordin polypeptides described herein are
those encoded by any of the nucleic acid molecules
described above and include huchordin fragments, mutants,
truncated forms, and fusion proteins. These polypeptides
can be prepared for a variety of uses, including but not
limited to the generation of antibodies, as reagents in
diagnostic assays, for the identification of other
cellular gene products or compounds that can modulate the
activity or expression of huchordin, and as
pharmaceutical reagents useful for the treatment of
disorders associated with aberrant expression or activity
of huchordin.

Preferred polypeptides are substantially pure
huchordin polypeptides, including those that correspond
to the polypeptide with and without intact signal
sequence. Especially preferred are huchordin polypeptides
that are soluble under normal physiological conditions.

The invention also encompasses polypeptides that are
functionally equivalent to huchordin. These polypeptides
are equivalent to huchordin in that they are capable of
carrying out one or more of the functions of huchordin in
a biological system. Preferred huchordin polypeptides
have 20%, 40%, 50%, 75%, 80%, or even 90% of one or more
of the biological activities of the full-length, mature
human form of huchordin. Such comparisons are generally
based on an assay of biological activity in which equal
concentrations of the polypeptides are used and compared.
The comparison can also be based on the amount of the
polypeptide required to reach 50% of the maximal
stimulation obtainable.

Functionally equivalent proteins can be those, for
example, that contain additional or substituted amino
acid residues. Substitutions may be made on the basis of
similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. Amino acids that are
typically considered to provide a conservative
substitution for one another are specified in the summary
of the invention.

Polypeptides that are functionally equivalent to
huchordin can be made using random mutagenesis techniques
well known to those skilled in the art (and the resulting
mutant huchordin proteins can be tested for activity).
It is more likely, however, that such polypeptides will
be generated by site-directed mutagenesis (again using
techniques well known to those skilled in the art).
These polypeptides may have increased functionality or
decreased functionality.

To design functionally equivalent polypeptides, it
is useful to distinguish between conserved positions and
variable positions. This can be done by aligning the
sequence of huchordin cDNAs that were obtained from
various organisms. Conserved residues can also be
identified by aligning motifs within huchordin. For
example, the cys residues of the cys-rich repeats are
conserved residues. Skilled artisans will recognize that
conserved amino acid residues are more likely to be
necessary for preservation of function. Thus, it is
preferable that conserved residues are not altered.
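
One simple way to separate conserved from variable positions, given
sequences that have already been aligned, is sketched below in Python.
The three aligned fragments are hypothetical and are not huchordin or
chordin sequences; a real analysis would start from an alignment
produced by a dedicated alignment program.

```python
def conserved_columns(aligned):
    """Return 0-based column indices where every sequence has the same residue.

    Columns containing a gap character ('-') are never reported as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Hypothetical aligned peptide fragments, for illustration only.
alignment = ["CKGQTYCS", "CRGETFCA", "CKGDSYCT"]
print(conserved_columns(alignment))
```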

Amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar
(hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan,
and methionine; polar neutral amino acids include
glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic
acid and glutamic acid.
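
The four groups listed above can be encoded directly, as in the
following Python sketch, to test whether a proposed substitution stays
within a group. Treating "same group" as the criterion for a
conservative substitution is a simplifying assumption made here for
illustration; the dictionary keys and function names are illustrative.

```python
# One-letter codes grouped exactly as in the paragraph above.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(residue):
    """Return the name of the group containing the given one-letter residue."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError("unknown amino acid: " + residue)


def is_conservative(original, replacement):
    """True if both residues fall in the same group (simplifying assumption)."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"), is_conservative("K", "E"))
```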

Mutations within the huchordin coding sequence can
be made to generate variant huchordin genes that are
better suited for expression in a selected host cell.
For example, N-linked glycosylation sites can be altered
or eliminated to achieve, for example, expression of a
homogeneous product that is more easily recovered and
purified from yeast hosts which are known to
hyperglycosylate N-linked sites. To this end, a variety
of amino acid substitutions at one or both of the first
or third amino acid positions of any one or more of the
glycosylation recognition sequences which occur (in N-X-S
or N-X-T), and/or an amino acid deletion at the second
position of any one or more of such recognition
sequences, will prevent glycosylation at the modified
tripeptide sequence (see, for example, Miyajima et al.,
EMBO J. 5:1193, 1986).
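
Candidate N-X-S and N-X-T recognition sequences can be located in a
protein sequence, and a first-position substitution modeled, with a
Python sketch along the following lines. The peptide is hypothetical,
and the choice of glutamine as the replacement residue is an
assumption made for illustration. The optional exclusion of proline at
the X position reflects a commonly applied refinement that is not
stated in the text above.

```python
import re


def find_sequons(protein, exclude_proline=False):
    """Return (1-based position, tripeptide) for each N-X-S/T match.

    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = r"(?=(N[^P][ST]))" if exclude_proline else r"(?=(N.[ST]))"
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(pattern, protein.upper())]


def substitute_asn(protein, position, replacement="Q"):
    """Replace the asparagine at a 1-based position (first sequon position)."""
    i = position - 1
    if protein[i].upper() != "N":
        raise ValueError("no asparagine at the given position")
    return protein[:i] + replacement + protein[i + 1:]


# Hypothetical peptide, for illustration only.
peptide = "MKNGSAANPTLLNVT"
print(find_sequons(peptide))
print(find_sequons(substitute_asn(peptide, 3)))
```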

The polypeptides of the invention can be expressed
fused to another polypeptide, for example, a marker
polypeptide or fusion partner. Alternatively, a fusion
protein may be readily purified by utilizing an antibody
specific for the fusion protein being expressed. For
example, a system described by Janknecht et al. allows
for the ready purification of non-denatured fusion
proteins expressed in human cell lines (Proc. Natl. Acad.
Sci. USA 88:8972-8976, 1991). In this system, the gene
of interest is subcloned into a vaccinia recombination
plasmid such that the gene's open reading frame is
translationally fused to an amino-terminal tag consisting
of six histidine residues. Extracts from cells infected
with recombinant vaccinia virus are loaded onto
Ni²⁺-nitriloacetic acid-agarose columns and
histidine-tagged proteins are selectively eluted with
imidazole-containing buffers.

The polypeptides of the invention can be chemically
synthesized (for example, see Creighton, "Proteins:
Structures and Molecular Principles," W.H. Freeman & Co.,
NY, 1983), or, perhaps more advantageously, produced by
recombinant DNA technology as described herein. For
additional guidance, skilled artisans may consult Ausubel
et al. (supra), Sambrook et al. ("Molecular Cloning, A
Laboratory Manual," Cold Spring Harbor Press, Cold Spring
Harbor, NY, 1989), and, particularly for examples of
chemical synthesis, Gait, M.J. Ed. ("Oligonucleotide
Synthesis," IRL Press, Oxford, 1984).

Once the recombinant huchordin protein is expressed,
it is isolated. Secreted forms can be isolated from the
culture media, while non-secreted forms must be isolated
from the host cells. Proteins can be isolated by
affinity chromatography. In one example, an
anti-huchordin protein antibody (e.g., produced as described
herein) is attached to a column and used to isolate the
huchordin protein. Lysis and fractionation of huchordin
protein-harboring cells prior to affinity chromatography
can be performed by standard methods (see, e.g., Ausubel
et al., supra). Alternatively, a huchordin fusion
protein, for example, a huchordin-maltose binding
protein, a huchordin-β-galactosidase, or a huchordin-trpE
fusion protein, can be constructed and used for huchordin
protein isolation (see, e.g., Ausubel et al., supra; New
England Biolabs, Beverly, MA).

Once isolated, the recombinant protein can, if
desired, be further purified, e.g., by high performance
liquid chromatography using standard techniques (see,
e.g., Fisher, Laboratory Techniques In Biochemistry And
Molecular Biology, eds., Work and Burdon, Elsevier,
1980).

The invention also features polypeptides that
interact with huchordin (and the genes that encode them)
and thereby alter the function of huchordin. Interacting
polypeptides can be identified using methods known to
those skilled in the art. One suitable method is the
"two-hybrid system," which detects protein interactions
in vivo (Chien et al., Proc. Natl. Acad. Sci. USA
88:9578, 1991). A kit for practicing this method is
available from Clontech (Palo Alto, CA).

Transgenic animals 

Huchordin polypeptides can also be expressed in
transgenic animals. These animals represent a model
system for the study of disorders that are caused by or
exacerbated by overexpression or underexpression of
huchordin, and for the development of therapeutic agents
that modulate the expression or activity of huchordin.

Transgenic animals can be farm animals (pigs, goats,
sheep, cows, horses, rabbits, and the like), rodents (such
as rats, guinea pigs, and mice), non-human primates (for
example, baboons, monkeys, and chimpanzees), and domestic
animals (for example, dogs and cats). Transgenic mice
are especially preferred.

Any technique known in the art can be used to
introduce a huchordin transgene into animals to produce
the founder lines of transgenic animals. Such techniques
include, but are not limited to, pronuclear
microinjection (U.S. Pat. No. 4,873,191); retrovirus
mediated gene transfer into germ lines (Van der Putten
et al., Proc. Natl. Acad. Sci. USA 82:6148, 1985); gene
targeting into embryonic stem cells (Thompson et al.,
Cell 56:313, 1989); and electroporation of embryos (Lo,
Mol. Cell. Biol. 3:1803, 1983).

The present invention provides for transgenic
animals that carry the huchordin transgene in all their
cells, as well as animals that carry a transgene in some,
but not all of their cells. That is, the invention
provides for mosaic animals. The transgene can be
integrated as a single transgene or in concatamers, e.g.,
head-to-head tandems or head-to-tail tandems. The
transgene can also be selectively introduced into and
activated in a particular cell type (Lasko et al., Proc.
Natl. Acad. Sci. USA 89:6232, 1992). The regulatory
sequences required for such a cell-type specific
activation will depend upon the particular cell type of
interest, and will be apparent to those of skill in the
art.

When it is desired that the huchordin transgene be
integrated into the chromosomal site of the endogenous
huchordin gene, gene targeting is preferred.
Briefly, when such a technique is to be used, vectors
containing some nucleotide sequences homologous to an
endogenous huchordin gene are designed for the purpose of
integrating, via homologous recombination with
chromosomal sequences, into and disrupting the function
of the nucleotide sequence of the endogenous gene. The
transgene also can be selectively introduced into a
particular cell type, thus inactivating the endogenous
huchordin gene in only that cell type (Gu et al., Science
265:103, 1984). The regulatory sequences required for
such a cell-type specific inactivation will depend upon
the particular cell type of interest, and will be
apparent to those of skill in the art. These techniques
are useful for preparing "knock outs" having no
functional huchordin gene.

Once transgenic animals have been generated, the
expression of the recombinant huchordin gene can be
assayed utilizing standard techniques. Initial screening
may be accomplished by Southern blot analysis or PCR
techniques to determine whether integration of the
transgene has taken place. The level of mRNA expression
of the transgene in the tissues of the transgenic animals
may also be assessed using techniques which include, but
are not limited to, Northern blot analysis of tissue
samples obtained from the animal, in situ hybridization
analysis, and RT-PCR. Samples of huchordin
gene-expressing tissue can also be evaluated
immunocytochemically using antibodies specific for the
huchordin transgene product.

For a review of techniques that can be used to
generate and assess transgenic animals, skilled artisans
can consult Gordon (Intl. Rev. Cytol. 115:171-229, 1989),
and may obtain additional guidance from, for example:
Hogan et al., "Manipulating the Mouse Embryo," Cold
Spring Harbor Press, Cold Spring Harbor, NY, 1986;
Krimpenfort et al., Bio/Technology 9:86, 1991; Palmiter
et al., Cell 41:343, 1985; Kraemer et al., "Genetic
Manipulation of the Early Mammalian Embryo," Cold Spring
Harbor Press, Cold Spring Harbor, NY, 1985; Hammer
et al., Nature 315:680, 1985; Purcel et al., Science
244:1281, 1986; Wagner et al., U.S. Patent No. 5,175,385;
and Krimpenfort et al., U.S. Patent No. 5,175,384 (the
latter two publications are hereby incorporated by
reference).

Anti-huchordin Antibodies

Huchordin polypeptides (or immunogenic fragments or
analogs) can be used to raise antibodies useful in the
invention; such polypeptides can be produced by
recombinant techniques or synthesized (see, for example,
"Solid Phase Peptide Synthesis," supra; Ausubel et al.,
supra). In general, the peptides can be coupled to a
carrier protein, such as KLH, as described in Ausubel
et al., supra, mixed with an adjuvant, and injected into
a host mammal. Antibodies can be purified by peptide
antigen affinity chromatography.

In particular, various host animals can be immunized
by injection with a huchordin protein or polypeptide.
Host animals include rabbits, mice, guinea pigs, and
rats. Various adjuvants that can be used to increase the
immunological response depend on the host species and
include Freund's adjuvant (complete and incomplete),
mineral gels such as aluminum hydroxide, surface active
substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, keyhole limpet
hemocyanin, and dinitrophenol. Potentially useful human
adjuvants include BCG (bacille Calmette-Guerin) and
Corynebacterium parvum. Polyclonal antibodies are
heterogeneous populations of antibody molecules that are
contained in the sera of the immunized animals.

Antibodies within the invention therefore include
polyclonal antibodies and, in addition, monoclonal
antibodies, humanized or chimeric antibodies, single
chain antibodies, Fab fragments, F(ab')2 fragments, and
molecules produced using a Fab expression library.

Monoclonal antibodies, which are homogeneous
populations of antibodies to a particular antigen, can be
prepared using the huchordin polypeptides described above
and standard hybridoma technology (see, for example,
Kohler et al., Nature 256:495, 1975; Kohler et al., Eur.
J. Immunol. 6:511, 1976; Kohler et al., Eur. J. Immunol.
6:292, 1976; Hammerling et al., "Monoclonal Antibodies
and T Cell Hybridomas," Elsevier, NY, 1981; Ausubel
et al., supra).

In particular, monoclonal antibodies can be obtained
by any technique that provides for the production of
antibody molecules by continuous cell lines in culture
such as described in Kohler et al., Nature 256:495, 1975,
and U.S. Patent No. 4,376,110; the human B-cell hybridoma
technique (Kosbor et al., Immunology Today 4:72, 1983;
Cole et al., Proc. Natl. Acad. Sci. USA 80:2026, 1983),
and the EBV-hybridoma technique (Cole et al., "Monoclonal
Antibodies and Cancer Therapy," Alan R. Liss, Inc., pp.
77-96, 1983). Such antibodies can be of any
immunoglobulin class including IgG, IgM, IgE, IgA, IgD
and any subclass thereof. The hybridoma producing the
mAb of this invention may be cultivated in vitro or
in vivo. The ability to produce high titers of mAbs
in vivo makes this a particularly useful method of
production.

Once produced, polyclonal or monoclonal antibodies
are tested for specific huchordin recognition by Western
blot or immunoprecipitation analysis by standard methods,
e.g., as described in Ausubel et al., supra. Antibodies
that specifically recognize and bind to huchordin are
useful in the invention. For example, such antibodies
can be used in an immunoassay to monitor the level of
huchordin produced by a mammal (for example, to determine
the amount or subcellular location of huchordin).

Preferably, antibodies of the invention are produced
using fragments of the huchordin protein which lie
outside highly conserved regions and appear likely to be
antigenic, by criteria such as high frequency of charged
residues. In one specific example, such fragments are
generated by standard techniques of PCR, and are then
cloned into the pGEX expression vector (Ausubel et al.,
supra). Fusion proteins are expressed in E. coli and
purified using a glutathione agarose affinity matrix as
described in Ausubel et al., supra.

Antisera are also checked for their ability to
immunoprecipitate recombinant huchordin proteins or
control proteins, such as glucocorticoid receptor, CAT,
or luciferase.

The antibodies can be used, for example, in the
detection of huchordin in a biological sample as part
of a diagnostic assay. Antibodies also can be used in a
screening assay to measure the effect of a candidate
compound on expression or localization of huchordin.
Additionally, such antibodies can be used in conjunction
with gene therapy techniques to, for example,
evaluate the normal and/or engineered huchordin-expressing
cells prior to their introduction into the
patient. Such antibodies additionally can be used in a
method for inhibiting abnormal huchordin activity.

In addition, techniques developed for the production
of "chimeric antibodies" (Morrison et al., Proc. Natl.
Acad. Sci. USA, 81:6851, 1984; Neuberger et al., Nature,
312:604, 1984; Takeda et al., Nature, 314:452, 1985) by
splicing the genes from a mouse antibody molecule of
appropriate antigen specificity together with genes from
a human antibody molecule of appropriate biological
activity can be used. A chimeric antibody is a molecule
in which different portions are derived from different
animal species, such as those having a variable region
derived from a murine mAb and a human immunoglobulin
constant region.

Alternatively, techniques described for the
production of single chain antibodies (U.S. Patent Nos.
4,946,778 and 4,704,692) can be adapted to
produce single chain antibodies against a huchordin
protein or polypeptide. Single chain antibodies are
formed by linking the heavy and light chain fragments of
the Fv region via an amino acid bridge, resulting in a
single chain polypeptide.

Antibody fragments that recognize and bind to
specific epitopes can be generated by known techniques.
For example, such fragments include but are not limited
to F(ab')2 fragments that can be produced by pepsin
digestion of the antibody molecule, and Fab fragments
that can be generated by reducing the disulfide bridges
of F(ab')2 fragments. Alternatively, Fab expression
libraries can be constructed (Huse et al., Science
246:1275, 1989) to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity.

Antibodies to huchordin can, in turn, be used to
generate anti-idiotype antibodies that resemble a portion
of huchordin using techniques well known to those skilled
in the art (see, e.g., Greenspan et al., FASEB J. 7:437,
1993; Nisonoff, J. Immunol. 147:2429, 1991). For
example, antibodies that bind to huchordin and
competitively inhibit the binding of a binding partner of
huchordin can be used to generate anti-idiotypic
antibodies that resemble a binding partner binding domain
of huchordin and, therefore, bind and neutralize a
binding partner of huchordin. Such neutralizing
anti-idiotypic antibodies or Fab fragments of such
anti-idiotypic antibodies can be used in therapeutic regimens.

Antibodies can be humanized by methods known in the
art. For example, monoclonal antibodies with a desired
binding specificity can be commercially humanized
(Scotgene, Scotland; Oxford Molecular, Palo Alto, CA).
Fully human antibodies, such as those expressed in
transgenic animals, are also features of the invention
(Green et al., Nature Genetics 7:13-21, 1994; see also
U.S. Patents 5,545,806 and 5,569,825, both of which are
hereby incorporated by reference).

The methods described herein in which anti-huchordin
antibodies are employed may be performed, for example, by
utilizing pre-packaged diagnostic kits comprising at
least one specific huchordin nucleotide sequence or
antibody reagent described herein, which may be
conveniently used, for example, in clinical settings, to
diagnose patients exhibiting symptoms of the disorders
described below.

Antisense Nucleic Acids

Treatment regimes based on an "antisense" approach
involve the design of oligonucleotides (either DNA or
RNA) that are complementary to huchordin mRNA. These
oligonucleotides bind to the complementary huchordin mRNA
transcripts and prevent translation. Absolute
complementarity, although preferred, is not required. A
sequence "complementary" to a portion of an RNA, as
referred to herein, means a sequence having sufficient
complementarity to be able to hybridize with the RNA,
forming a stable duplex; in the case of double-stranded
antisense nucleic acids, a single strand of the duplex
DNA may be tested, or triplex formation may be assayed.
The ability to hybridize will depend on both the degree
of complementarity and the length of the antisense
nucleic acid. Generally, the longer the hybridizing
nucleic acid, the more base mismatches with an RNA it may
contain and still form a stable duplex (or triplex, as
the case may be). One skilled in the art can ascertain a
tolerable degree of mismatch by use of standard
procedures to determine the melting point of the
hybridized complex.
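By way of illustration only, the relationship between mismatches and duplex stability described above can be explored with a short script. This is a minimal sketch and not part of the described method: the function names are hypothetical, and the melting temperature uses the Wallace rule (2 degrees C per A/T/U, 4 degrees C per G/C), a rough rule of thumb for short oligonucleotides that does not replace an experimentally determined melting point.

```python
# Watson-Crick partners for RNA bases (hypothetical helper table).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(antisense: str, target: str) -> int:
    """Count unpaired positions between an antisense RNA and its target.

    Both strands are written 5'->3'. Because the strands pair antiparallel,
    the antisense strand is reversed before the position-by-position check.
    """
    if len(antisense) != len(target):
        raise ValueError("strands must be the same length")
    return sum(
        1 for a, t in zip(reversed(antisense), target) if RNA_PAIR[t] != a
    )


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    at = sum(oligo.count(base) for base in "ATU")
    gc = sum(oligo.count(base) for base in "GC")
    return 2 * at + 4 * gc
```

The sequences passed to these functions would be toy inputs or regions chosen by the user; nothing here encodes the huchordin sequence itself.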

Oligonucleotides that are complementary to the
5' end of the message, e.g., the 5' untranslated sequence
up to and including the AUG initiation codon, should work
most efficiently at inhibiting translation. However,
sequences complementary to the 3' untranslated sequences
of mRNAs recently have been shown to be effective at
inhibiting translation of mRNAs as well (Wagner, Nature
372:333, 1994). Thus, oligonucleotides complementary to
either the 5' or 3' non-translated, non-coding regions of
the huchordin gene, e.g., the human gene shown in Fig. 1,
could be used in an antisense approach to inhibit
translation of endogenous huchordin mRNA.

Oligonucleotides complementary to the 5' untranslated
region of the mRNA should include the complement of the
AUG start codon.

Antisense oligonucleotides complementary to mRNA
coding regions are less efficient inhibitors of
translation but could be used in accordance with the
invention. Whether designed to hybridize to the
5', 3', or coding region of huchordin mRNA, antisense
nucleic acids should be at least six nucleotides in
length, and are preferably oligonucleotides ranging from
6 to about 50 nucleotides in length. In specific aspects
the oligonucleotide is at least 10 nucleotides, at least
17 nucleotides, at least 25 nucleotides, or at least
50 nucleotides.
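The design constraints above (a window on the mRNA, a complementary oligonucleotide, and a length of at least six nucleotides) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function and constant names are hypothetical, the 50-nucleotide cap is the preferred upper bound treated here as a hard limit for simplicity, and the example input is a toy sequence rather than huchordin mRNA.

```python
# DNA base that pairs with each mRNA base.
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}

MIN_LENGTH = 6   # "at least six nucleotides in length"
MAX_LENGTH = 50  # upper end of the preferred range; a hard cap in this sketch


def antisense_dna(mrna: str, start: int, length: int) -> str:
    """Return the DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not MIN_LENGTH <= length <= MAX_LENGTH:
        raise ValueError("length must be between 6 and 50 nucleotides")
    if start < 0 or start + length > len(mrna):
        raise ValueError("window falls outside the mRNA")
    window = mrna[start:start + length]
    # Reverse the window so the returned oligo reads 5'->3'.
    return "".join(DNA_PARTNER[base] for base in reversed(window))
```

For a window that begins at an AUG codon, the returned oligonucleotide ends in the complement of the start codon (CAT), consistent with the preference for covering the initiation codon.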

Regardless of the choice of target sequence, it is
preferred that in vitro studies are first performed to
quantitate the ability of the antisense oligonucleotide
to inhibit gene expression. It is preferred that these
studies utilize controls that distinguish between
antisense gene inhibition and nonspecific biological
effects of oligonucleotides. It is also preferred that
these studies compare levels of the target RNA or protein
with that of an internal control RNA or protein.

Additionally, it is envisioned that results obtained
using the antisense oligonucleotide are compared with
those obtained using a control oligonucleotide. It is
preferred that the control oligonucleotide is of
approximately the same length as the test oligonucleotide
and that the nucleotide sequence of the oligonucleotide
differs from the antisense sequence no more than is
necessary to prevent specific hybridization to the target
sequence.
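One way to picture such a control is a mismatch control: an oligonucleotide of identical length in which only a few evenly spaced bases are changed. The sketch below is an assumption-laden illustration, not a prescribed procedure. The function name, the substitution table, and the number of mismatches are all hypothetical; how many substitutions are actually needed to abolish specific hybridization would have to be determined experimentally.

```python
# Substitution used at each mismatch position (A<->C, G<->T); an arbitrary choice.
SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def mismatch_control(antisense: str, n_mismatches: int) -> str:
    """Return a same-length DNA control with n evenly spaced substitutions."""
    n = len(antisense)
    if not 1 <= n_mismatches <= n:
        raise ValueError("n_mismatches must be between 1 and the oligo length")
    # Spread the substitutions evenly along the oligonucleotide.
    positions = {
        (i * n) // n_mismatches + n // (2 * n_mismatches)
        for i in range(n_mismatches)
    }
    return "".join(
        SWAP[base] if i in positions else base
        for i, base in enumerate(antisense)
    )
```

Keeping the length and most of the sequence unchanged means the control shares the nonspecific properties of the test oligonucleotide while differing at the chosen positions.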

The oligonucleotides can be DNA or RNA or PNA or
chimeric mixtures or derivatives or modified versions
thereof, single-stranded or double-stranded. The
oligonucleotide can be modified at the base moiety, sugar
moiety, or phosphate backbone, for example, to improve
stability of the molecule, hybridization, etc. The
oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors
in vivo), or agents facilitating transport across the
cell membrane (as described, e.g., in Letsinger et al.,
Proc. Natl. Acad. Sci. USA 86:6553, 1989; Lemaitre
et al., Proc. Natl. Acad. Sci. USA 84:648, 1987; PCT
Publication No. WO 88/09810) or the blood-brain barrier
(see, for example, PCT Publication No. WO 89/10134), or
hybridization-triggered cleavage agents (see, for
example, Krol et al., BioTechniques 6:958, 1988), or
intercalating agents (see, for example, Zon, Pharm. Res.
5:539, 1988). To this end, the oligonucleotide can be
conjugated to another molecule, e.g., a peptide,
hybridization-triggered cross-linking agent, transport
agent, or hybridization-triggered cleavage agent.

The antisense oligonucleotide may comprise at least
one modified base moiety which is selected from the group
including, but not limited to, 5-fluorouracil,
5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic
acid (v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid
methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 2-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine.

The antisense oligonucleotide may also comprise at
least one modified sugar moiety selected from the group
including, but not limited to, arabinose,
2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the antisense
oligonucleotide comprises at least one modified phosphate
backbone selected from the group consisting of a
phosphorothioate, a phosphorodithioate, a
phosphoramidothioate, a phosphoramidate, a
phosphordiamidate, a methylphosphonate, an alkyl
phosphotriester, and a formacetal, or an analog of any of
these backbones.

In yet another embodiment, the antisense
oligonucleotide is an α-anomeric oligonucleotide. An
α-anomeric oligonucleotide forms specific double-stranded
hybrids with complementary RNA in which, contrary to the
usual β-units, the strands run parallel to each other
(Gautier et al., Nucl. Acids Res. 15:6625, 1987). The
oligonucleotide is a 2'-O-methylribonucleotide (Inoue
et al., Nucl. Acids Res. 15:6131, 1987), or a chimeric
RNA-DNA analog (Inoue et al., FEBS Lett. 215:327, 1987).

Peptide nucleic acid (PNA) oligonucleotides can be
used as antisense molecules (Hyrup et al., Bioorganic &
Medicinal Chem. 4:5, 1996).

Antisense oligonucleotides of the invention can be
synthesized by standard methods known in the art, e.g.,
by use of an automated DNA synthesizer (such as are
commercially available from Biosearch, Applied
Biosystems, etc.). As examples, phosphorothioate
oligonucleotides can be synthesized by the method of
Stein et al. (Nucl. Acids Res. 16:3209, 1988), and
methylphosphonate oligonucleotides can be prepared by use
of controlled pore glass polymer supports (Sarin et al.,
Proc. Natl. Acad. Sci. USA 85:7448, 1988).

While antisense nucleotides complementary to the
huchordin coding region sequence could be used, those
complementary to the transcribed untranslated region are
most preferred.

The antisense molecules should be delivered to cells
that express huchordin in vivo. A number of methods have
been developed for delivering antisense DNA or RNA to
cells; e.g., antisense molecules can be injected directly
into the tissue site, or modified antisense molecules,
designed to target the desired cells (e.g., antisense
linked to peptides or antibodies that specifically bind
receptors or antigens expressed on the target cell
surface) can be administered systemically.

However, it is often difficult to achieve
intracellular concentrations of the antisense molecule
sufficient to suppress translation of endogenous mRNAs.
Therefore, a preferred approach uses a recombinant DNA
construct in which the antisense oligonucleotide is
placed under the control of a strong pol III or pol II
promoter. The use of such a construct to transfect
target cells in the patient will result in the
transcription of sufficient amounts of single-stranded
RNAs that will form complementary base pairs with the
endogenous huchordin transcripts and thereby prevent
translation of the huchordin mRNA. For example, a vector
can be introduced in vivo such that it is taken up by a
cell and directs the transcription of an antisense RNA.
Such a vector can remain episomal or become chromosomally
integrated, as long as it can be transcribed to produce
the desired antisense RNA.

Such vectors can be constructed by recombinant DNA
technology methods standard in the art. Vectors can be
plasmid, viral, or others known in the art, used for
replication and expression in mammalian cells.
Expression of the sequence encoding the antisense RNA can
be by any promoter known in the art to act in mammalian,
preferably human, cells. Such promoters can be inducible
or constitutive. Such promoters include, but are not
limited to: the SV40 early promoter region (Benoist
et al., Nature 290:304, 1981); the promoter contained in
the 3' long terminal repeat of Rous sarcoma virus
(Yamamoto et al., Cell 22:787-797, 1980); the herpes
thymidine kinase promoter (Wagner et al., Proc. Natl.
Acad. Sci. USA 78:1441, 1981); or the regulatory
sequences of the metallothionein gene (Brinster et al.,
Nature 296:39, 1982).

Ribozymes

Ribozyme molecules designed to catalytically cleave
huchordin mRNA transcripts also can be used to prevent
translation of huchordin mRNA and expression of huchordin
(see, e.g., PCT Publication WO 90/11364; Sarver et al.,
Science 247:1222, 1990). While various ribozymes that
cleave mRNA at site-specific recognition sequences can be
used to destroy huchordin mRNAs, the use of hammerhead
ribozymes is preferred. Hammerhead ribozymes cleave
mRNAs at locations dictated by flanking regions that form
complementary base pairs with the target mRNA. The sole
requirement is that the target mRNA have the following
sequence of two bases: 5'-UG-3'. The construction and
production of hammerhead ribozymes is well known in the
art (Haseloff et al., Nature 334:585, 1988). There are
numerous examples of potential hammerhead ribozyme
cleavage sites within the nucleotide sequence of human
huchordin cDNA. Preferably, the ribozyme is engineered
so that the cleavage recognition site is located near the
5' end of the huchordin mRNA, i.e., to increase
efficiency and minimize the intracellular accumulation of
non-functional mRNA transcripts.
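As an illustration of how candidate cleavage sites of the kind described above might be enumerated, the following sketch scans a sequence for the 5'-UG-3' dinucleotide and reports positions in 5' to 3' order, so that sites nearest the 5' end come first. The function name is hypothetical and the input shown in any test is a toy sequence; the sketch only locates the dinucleotide and says nothing about flanking-arm design or cleavage efficiency.

```python
def ug_sites(sequence: str) -> list[int]:
    """Return 0-based positions of every 5'-UG-3' dinucleotide.

    Accepts mRNA (U) or cDNA (T) in either case; cDNA is converted to
    RNA letters before scanning. Positions are in 5'->3' order.
    """
    rna = sequence.upper().replace("T", "U")
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == "UG"]
```

Because the list is ordered from the 5' end, its first entries correspond to the sites the text prefers.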
The ribozymes of the present invention also include
RNA endoribonucleases (hereinafter "Cech-type
ribozymes"), such as the one that occurs naturally in
Tetrahymena thermophila (known as the IVS or L-19 IVS
RNA), and which has been extensively described by Cech
and his collaborators (Zaug et al., Science 224:574,
1984; Zaug et al., Science 231:470, 1986; Zaug et al.,
Nature 324:429, 1986; PCT Application No. WO 88/04300;
and Been et al., Cell 47:207, 1986). The Cech-type
ribozymes have an eight base-pair sequence that
hybridizes to a target RNA sequence, whereafter cleavage
of the target RNA takes place. The invention encompasses
those Cech-type ribozymes that target eight base-pair
active site sequences present in huchordin.

As in the antisense approach, the ribozymes can be
composed of modified oligonucleotides (e.g., for improved
stability, targeting, etc.), and should be delivered to
cells which express huchordin in vivo. A preferred
method of delivery involves using a DNA construct
"encoding" the ribozyme under the control of a strong
constitutive pol III or pol II promoter, so that
transfected cells will produce sufficient quantities of
the ribozyme to destroy endogenous huchordin messages and
inhibit translation. Because ribozymes, unlike antisense
molecules, are catalytic, a lower intracellular
concentration is required for efficiency.

Methods for Reducing Huchordin Expression 

A variety of methods can be used to reduce huchordin
expression. For example, the antisense techniques
described above can be used to reduce huchordin
expression.

Endogenous huchordin gene expression can also be
reduced by inactivating or "knocking out" the huchordin
gene or its promoter using targeted homologous
recombination (see, e.g., U.S. Patent No. 5,464,764).
For example, a mutant, non-functional huchordin (or a
completely unrelated DNA sequence) flanked by DNA
homologous to the endogenous huchordin gene (either the
coding regions or regulatory regions of the huchordin
gene) can be used, with or without a selectable marker
and/or a negative selectable marker, to transfect cells
that express huchordin in vivo. Insertion of the DNA
construct, via targeted homologous recombination, results
in inactivation of the huchordin gene. Such approaches
are particularly suited for use in the agricultural field
where modifications to ES (embryonic stem) cells can be
used to generate animal offspring with an inactive
huchordin. However, this approach can be adapted for use
in humans, provided the recombinant DNA constructs are
directly administered or targeted to the required site
in vivo using appropriate viral vectors.

Alternatively, endogenous huchordin gene expression
can be reduced using deoxyribonucleotide sequences
complementary to the regulatory region of the huchordin
gene (i.e., the huchordin promoter and/or enhancers) to
form triple helical structures that prevent transcription
of the huchordin gene in target cells in the body (Helene,
Anticancer Drug Des. 6:569, 1991; Helene et al., Ann.
N.Y. Acad. Sci. 660:27, 1992; and Maher, BioEssays
14:807, 1992) or through the use of small molecules which
interfere with the expression or activity of
transcription factors which regulate huchordin
expression.

Detecting Proteins Associated with Huchordin 

The invention also features polypeptides which
interact with huchordin. Any method suitable for
detecting protein-protein interactions may be employed
for identifying transmembrane, intracellular, or
extracellular proteins that interact with huchordin.
Among the traditional methods which may be employed are
co-immunoprecipitation, crosslinking and co-purification
through gradients or chromatographic columns of cell
lysates or proteins obtained from cell lysates and the
use of huchordin to identify proteins in the lysate that
interact with huchordin. For these assays, the huchordin
polypeptide can be a full length huchordin, a soluble
extracellular domain of huchordin, or some other suitable
huchordin polypeptide. Once isolated, such an
interacting protein can be identified and cloned and then
used, in conjunction with standard techniques, to
identify proteins with which it interacts. For example,
at least a portion of the amino acid sequence of a
protein which interacts with huchordin can be
ascertained using techniques well known to those of skill
in the art, such as via the Edman degradation technique.
The amino acid sequence obtained may be used as a guide
for the generation of oligonucleotide mixtures that can
be used to screen for gene sequences encoding the
interacting protein. Screening may be accomplished, for
example, by standard hybridization or PCR techniques.
Techniques for the generation of oligonucleotide mixtures
and the screening are well known (Ausubel, supra; and
"PCR Protocols: A Guide to Methods and Applications,"
Innis et al., eds., Academic Press, Inc., NY, 1990).

Additionally, methods may be employed which result
directly in the identification of genes which encode
proteins which interact with huchordin. These methods
include, for example, screening expression libraries, in
a manner similar to the well known technique of antibody
probing of λgt11 libraries, using labeled huchordin
polypeptide or a huchordin fusion protein, e.g., a
huchordin polypeptide or domain fused to a marker such as
an enzyme, fluorescent dye, a luminescent protein, or to
an IgFc domain.

There are also methods which are capable of
detecting protein-protein interaction. A method which
detects protein interactions in vivo is the two-hybrid
system (Chien et al., Proc. Natl. Acad. Sci. USA
88:9578, 1991). A kit for practicing this method is
available from Clontech (Palo Alto, CA).

Briefly, utilizing such a system, plasmids are
constructed that encode two hybrid proteins: one plasmid
includes a nucleotide sequence encoding the DNA-binding
domain of a transcription activator protein fused to a
nucleotide sequence encoding huchordin, a huchordin
polypeptide, or a huchordin fusion protein, and the other
plasmid includes a nucleotide sequence encoding the
transcription activator protein's activation domain fused
to a cDNA encoding an unknown protein which has been
recombined into this plasmid as part of a cDNA library.
The DNA-binding domain fusion plasmid and the cDNA
library are transformed into a strain of the yeast
Saccharomyces cerevisiae that contains a reporter gene
(e.g., HIS3 or lacZ) whose regulatory region contains the
transcription activator's binding site. Either hybrid
protein alone cannot activate transcription of the
reporter gene: the DNA-binding domain hybrid cannot
because it does not provide activation function and the
activation domain hybrid cannot because it cannot
localize to the activator's binding sites. Interaction
of the two hybrid proteins reconstitutes the functional
activator protein and results in expression of the
reporter gene, which is detected by an assay for the
reporter gene product.
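The logic of the two-hybrid readout described above reduces to a simple rule: the reporter is expressed only when the DNA-binding domain hybrid and the activation domain hybrid are both present and the two fused proteins interact. The sketch below is a toy model of that rule only. All names are hypothetical, and the library entries are placeholders, not real clones or measured interactions.

```python
def reporter_on(bait_present: bool, prey_present: bool, interact: bool) -> bool:
    """Reporter is expressed only if both hybrids are present and interact.

    Either hybrid alone cannot activate transcription: the DNA-binding
    domain hybrid lacks activation function, and the activation domain
    hybrid cannot localize to the activator's binding site.
    """
    return bait_present and prey_present and interact


def positive_clones(library: dict[str, bool]) -> list[str]:
    """Return library clones whose encoded protein interacts with the bait.

    `library` maps a clone name to whether its product binds the bait
    (in practice this is the unknown that the screen reveals).
    """
    return [
        name for name, interacts in library.items()
        if reporter_on(True, True, interacts)
    ]
```

In a real screen the interaction flag is not known in advance; reporter expression is the observation from which it is inferred.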

The two-hybrid system, three-hybrid system or
related methodology may be used to screen activation
domain libraries for proteins that interact with the
"bait" gene product. By way of example, and not by way
of limitation, huchordin may be used as the bait gene
product. Total genomic or cDNA sequences are fused to
the DNA encoding an activation domain. This library and
a plasmid encoding a hybrid of bait huchordin gene
product fused to the DNA-binding domain are cotransformed
into a yeast reporter strain, and the resulting



WO 99/15556 



PCT/US98/20675 




transformants are screened for those that express the
reporter gene. For example, a bait huchordin gene
sequence, such as huchordin or a domain of huchordin, can
be cloned into a vector such that it is translationally
fused to the DNA encoding the DNA-binding domain of the
GAL4 protein. These colonies are purified and the
library plasmids responsible for reporter gene expression
are isolated. DNA sequencing is then used to identify
the proteins encoded by the library plasmids.

A cDNA library of the cell line from which proteins
that interact with bait huchordin gene product are to be
detected can be made using methods routinely practiced in
the art. According to the particular system described
herein, for example, the cDNA fragments can be inserted
into a vector such that they are translationally fused to
the transcriptional activation domain of GAL4. This
library can be co-transformed along with the bait
huchordin gene-GAL4 fusion plasmid into a yeast strain
which contains a lacZ gene driven by a promoter which
contains GAL4 activation sequence. A cDNA encoded
protein, fused to the GAL4 transcriptional activation domain,
that interacts with bait huchordin gene product will
reconstitute an active GAL4 protein and thereby drive
expression of the HIS3 gene. Colonies which express HIS3
can then be purified from these strains, and used to
produce and isolate the bait huchordin gene-interacting
protein using techniques routinely practiced in the art.

Identification of a Huchordin Receptor

A huchordin receptor can be identified as follows.
First, cells or tissues which bind huchordin are
identified. An expression library is prepared using mRNA
isolated from huchordin-binding cells. The expression
library is used to transfect eukaryotic cells, e.g., CHO
cells. Detectably labelled huchordin is used to identify
clones which bind huchordin. These clones are isolated
and purified. The expression plasmid is then isolated
from the huchordin-binding clones. These expression
plasmids will encode putative huchordin receptors.

Identification of Compounds that Modulate the Expression or Activity of Huchordin

Isolation of the nucleic acid molecules described
above (i.e., those encoding huchordin) also facilitates the
identification of compounds that can increase or decrease
the expression of these molecules in vivo. To discover
such compounds, cells that express huchordin are
cultured, exposed to a test compound (or a mixture of
test compounds), and the level of huchordin expression or
activity is compared with the level of expression or
activity in cells that are otherwise identical but that
have not been exposed to the test compound(s). Many
standard quantitative assays of gene expression can be
utilized in this aspect of the invention. Examples of
these assays are provided below.

In order to identify compounds that modulate
expression of huchordin (or homologous genes), the
candidate compound(s) can be added at varying
concentrations to the culture medium of cells that
express huchordin, as described above. These compounds
can include small molecules, polypeptides, and nucleic
acids. The expression of huchordin is then measured, for
example, by Northern blot, PCR analyses or RNase
protection analyses using a nucleic acid molecule of the
invention as a probe. The level of expression of the
polypeptides of the invention in the presence of the
candidate molecule, compared with their level of
expression in its absence, will indicate whether or not
the candidate molecule alters the expression of
huchordin.
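The comparison described above, expression in the presence of the candidate compound relative to expression in its absence, can be expressed as a simple ratio evaluated at each tested concentration. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are any quantified expression signals (for example band intensities), and the two-fold threshold used to call a change is an arbitrary illustrative choice, not a value taken from this description.

```python
def relative_expression(treated: float, untreated: float) -> float:
    """Ratio of expression with the compound to expression without it."""
    if untreated <= 0:
        raise ValueError("untreated signal must be positive")
    return treated / untreated


def classify(ratio: float, fold: float = 2.0) -> str:
    """Label a ratio using a hypothetical fold-change threshold."""
    if ratio >= fold:
        return "increases"
    if ratio <= 1 / fold:
        return "decreases"
    return "no clear change"


def dose_response(signals: dict[float, float], untreated: float) -> dict[float, str]:
    """Classify the effect at each concentration (keys) from its signal (values)."""
    return {
        conc: classify(relative_expression(signal, untreated))
        for conc, signal in sorted(signals.items())
    }
```

In practice each signal would first be normalized to an internal control, and replicate measurements would be needed before calling any change.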

Similarly, compounds that modulate the expression of
the polypeptides of the invention can be identified by
carrying out the assay described above and then
performing a Western blot analysis using antibodies that
bind huchordin.

Compounds that can be screened in accordance with
the invention include, but are not limited to, peptides,
antibodies and fragments thereof, and other organic
compounds (e.g., peptidomimetics).

Such compounds can include, but are not limited to,
peptides such as, for example, soluble peptides,
including but not limited to members of random peptide
libraries (see, e.g., Lam et al., Nature 354:82, 1991;
Houghten et al., Nature 354:84, 1991), and combinatorial
chemistry-derived molecular libraries made of D- and/or
L-configuration amino acids, phosphopeptides (including,
but not limited to, members of random or partially
degenerate, directed phosphopeptide libraries; see, e.g.,
Songyang et al., Cell 72:767, 1993), antibodies
(including, but not limited to, polyclonal, monoclonal,
humanized, anti-idiotypic, chimeric or single chain
antibodies, and Fab, F(ab')2 and Fab expression library
fragments, and epitope-binding fragments thereof), and
small organic or inorganic molecules.

Other compounds that can be screened in accordance
with the invention include, but are not limited to, small
organic molecules that affect the expression of the
huchordin gene or some other gene involved in a pathway
(e.g., signal transduction pathway) involving huchordin
(e.g., by interacting with the regulatory region or
transcription factors involved in gene expression).

Compounds which Bind Huchordin

Compounds which bind huchordin can be identified
using any standard binding assay. The principle of the
assays used to identify compounds that bind to huchordin
involves preparing a reaction mixture of huchordin and
the test compound under conditions and for a time
sufficient to allow the two components to interact and
bind, thus forming a complex which can be removed and/or
detected in the reaction mixture.

The screening assays can be conducted in a variety
of ways. For example, one method to conduct such an
assay would involve anchoring the huchordin protein,
polypeptide, peptide or fusion protein or the test
substance onto a solid phase and detecting huchordin/test
compound complexes anchored on the solid phase at the end
of the reaction. In one embodiment of such a method,
huchordin may be anchored onto a solid surface, and the
test compound, which is not anchored, may be labeled,
either directly or indirectly.

In practice, microtiter plates may conveniently be
utilized as the solid phase. The anchored component can
be immobilized by non-covalent or covalent attachments.
Non-covalent attachment can be accomplished by simply
coating the solid surface with a solution of the protein
and drying. Alternatively, an immobilized antibody,
preferably a monoclonal antibody, specific for the
protein to be immobilized can be used to anchor the
protein to the solid surface. The surfaces can be
prepared in advance and stored.

In order to conduct the assay, the nonimmobilized
component is added to the coated surface containing the
anchored component. After the reaction is complete,
unreacted components are removed (e.g., by washing) under
conditions such that any complexes formed will remain
immobilized on the solid surface. The detection of
complexes anchored on the solid surface can be
accomplished in a number of ways. Where the previously
nonimmobilized component is pre-labeled, the detection of
label immobilized on the surface indicates that complexes
were formed. Where the previously nonimmobilized
component is not pre-labeled, an indirect label can be
used to detect complexes anchored on the surface; for
example, using a labeled antibody specific for the
previously nonimmobilized component (the antibody, in
turn, can be directly labeled or indirectly labeled with
a labeled anti-Ig antibody).

Alternatively, a reaction can be conducted in a
liquid phase, the reaction products separated from
unreacted components, and complexes detected; for
example, using an immobilized antibody specific for a
huchordin protein, polypeptide, peptide or fusion protein
or the test compound to anchor any complexes formed in
solution, and a labeled antibody specific for the other
component of the possible complex to detect anchored
complexes.

Alternatively, cell-based assays can be used to 
identify compounds that interact with huchordin. To this 
end, cell lines that express huchordin or cell lines 
(e.g., COS cells, CHO cells, fibroblasts, etc.) that have 
been genetically engineered to express huchordin (e.g., 
by transfection or transduction of huchordin DNA) can be 
used. 

Therapeutic Applications 

Huchordin nucleic acid molecules, polypeptides, and 
molecules capable of altering huchordin 
expression, activity, or localization can be used to 
treat a patient suffering from a disorder associated with 
aberrant expression or activity of huchordin. Such 
compounds may be used to inhibit fibrosis or 
angiogenesis. 

Diagnostic Applications 

The polypeptides of the invention and the antibodies 
specific for these polypeptides are also useful for 
identifying those compartments of mammalian cells that 
contain proteins important to the function of huchordin. 
Antibodies specific for huchordin can be produced as 
described above. The normal subcellular location of the 
protein is then determined either in situ or using 
fractionated cells by any standard immunological or 
immunohistochemical procedure (see, e.g., Ausubel et al., 
supra; Bancroft and Stevens, Theory and Practice of 
Histological Techniques, Churchill Livingstone, 1982). 

Antibodies specific for huchordin also can be used 
to detect or monitor huchordin-related diseases. For 
example, levels of a huchordin protein in a sample can be 
assayed by any standard technique using these antibodies. 
For example, huchordin protein expression can be 
monitored by standard immunological or 
immunohistochemical procedures (e.g., those described 
above) using the antibodies described herein. 
Alternatively, huchordin expression can be assayed by 
standard Northern blot analysis or can be aided by PCR 
(see, e.g., Ausubel et al., supra; PCR Technology: 
Principles and Applications for DNA Amplification, ed., 
H.A. Ehrlich, Stockton Press, NY). If desired or 
necessary, analysis can be carried out to detect point 
mutations in the huchordin sequence (for example, using 
well known nucleic acid mismatch detection techniques). 
All of the above techniques are enabled by the huchordin 
sequences described herein. 

In addition, the present invention encompasses 
methods and compositions for the diagnostic evaluation, 
typing, and prognosis of disorders associated with 
inappropriate expression or activity of huchordin. For 
example, the nucleic acid molecules of the invention can 
be used as diagnostic hybridization probes to detect, for 
example, inappropriate expression of huchordin or 
mutations in the huchordin gene. Such methods may be 
used to classify cells by the level of huchordin 
expression. 

Thus, the invention features a method for diagnosing 
a disorder associated with aberrant activity of 
huchordin, the method including obtaining a biological 
sample from a patient and measuring huchordin activity in 
the biological sample, wherein increased or decreased 
huchordin activity in the biological sample compared to a 
control indicates that the patient suffers from a 
disorder associated with aberrant activity of huchordin. 

High density oligonucleotide probe arrays can be 
used to detect mutations or polymorphisms in the huchordin 
gene. A tiling array (Cronin et al., Human Mutation 
7:244, 1996; Kozal et al., Nature Med. 2:753, 1996) can 
be used to locate mutations anywhere in the gene. A 
mutation array (Cronin et al., Human Mutation 7:244, 
1996) can be used to detect the presence of previously 
identified mutations. 
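
The tiling approach mentioned above can be illustrated with a minimal sketch. It assumes the common design in which every interrogated position of a reference sequence is covered by a set of four probes that are identical except at the central base, which is A, C, G, or T. The function name, the probe length, and the toy reference sequence are all hypothetical and are not taken from the huchordin sequence. A real array would also use probes complementary to the labeled target, a step omitted here.

```python
def tiling_probes(reference: str, probe_len: int = 15):
    """Yield (position, central_base, probe) for each interrogated position.

    Assumption: one set of four probes per position, differing only at the
    central base. The probe length must be odd so that a center exists.
    """
    if probe_len % 2 == 0:
        raise ValueError("probe_len must be odd")
    half = probe_len // 2
    ref = reference.upper()
    # Only positions with a full flanking window on both sides are tiled.
    for center in range(half, len(ref) - half):
        window = ref[center - half:center + half + 1]
        for base in "ACGT":
            probe = window[:half] + base + window[half + 1:]
            yield center, base, probe


# Hypothetical toy reference, for illustration only.
toy_reference = "ACGTTGCAAGCTTAGC"
probes = list(tiling_probes(toy_reference, probe_len=5))
```

In each set of four, exactly one probe matches the reference perfectly; a sample that hybridizes best to one of the other three suggests a substitution at that position.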

The present invention further provides for 
diagnostic kits for the practice of such methods. 

Effective Dose 

Toxicity and therapeutic efficacy of the 
polypeptides of the invention and the compounds that 
modulate their expression or activity can be determined 
by standard pharmaceutical procedures, using either cells 
in culture or experimental animals to determine the LD50 
(the dose lethal to 50% of the population) and the ED50 
(the dose therapeutically effective in 50% of the 
population). The dose ratio between toxic and 
therapeutic effects is the therapeutic index and it can 
be expressed as the ratio LD50/ED50. Polypeptides or other 
compounds that exhibit large therapeutic indices are 
preferred. While compounds that exhibit toxic side 
effects may be used, care should be taken to design a 
delivery system that targets such compounds to the site 
of affected tissue in order to minimize potential damage 
to uninfected cells and, thereby, reduce side effects. 
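
As a worked illustration of the ratio defined above, the following minimal sketch computes a therapeutic index from an LD50 and an ED50. The function name and the numerical doses are hypothetical and are not data from this application; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, defined as LD50 / ED50.

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_index = therapeutic_index(ld50=500.0, ed50=25.0)
```

A larger value means a wider margin between the dose that is effective and the dose that is lethal in half of the population.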

The data obtained from the cell culture assays and 
animal studies can be used in formulating a range of 
dosage for use in humans. The dosage of such compounds 
lies preferably within a range of circulating 
concentrations that include the ED50 with little or no 
toxicity. The dosage may vary within this range 
depending upon the dosage form employed and the route of 
administration utilized. For any compound used in the 
method of the invention, the therapeutically effective 
dose can be estimated initially from cell culture assays. 
A dose may be formulated in animal models to achieve a 
circulating plasma concentration range that includes the 
IC50 (that is, the concentration of the test compound 
which achieves a half-maximal inhibition of symptoms) as 
determined in cell culture. Such information can be used 
to more accurately determine useful doses in humans. 
Levels in plasma may be measured, for example, by high 
performance liquid chromatography. 
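
One simple way to estimate an IC50 from cell culture data is sketched below. It is an assumption of this sketch, not a method stated in the text, that inhibition is normalized to a 0 to 100 scale and that the IC50 is found by linear interpolation on a log-concentration axis between the two measurements that bracket 50% inhibition. The function name and the data points are hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the IC50 by log-linear interpolation.

    concentrations: positive values in ascending order.
    inhibitions: percent inhibition (0-100) at each concentration.
    Returns None when no adjacent pair of points brackets 50%.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Hypothetical dose-response data (concentration in nM, percent inhibition).
example_ic50 = estimate_ic50([1.0, 100.0], [0.0, 100.0])
```

In practice a fitted sigmoidal dose-response curve is often preferred over interpolation; the sketch only shows how the half-maximal point is located.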

Formulations and Use 

Pharmaceutical compositions for use in accordance 
with the present invention may be formulated in 
conventional manner using one or more physiologically 
acceptable carriers or excipients. 

Thus, the compounds and their physiologically 
acceptable salts and solvates may be formulated for 
administration by inhalation or insufflation (either 
through the mouth or the nose) or oral, buccal, 
parenteral or rectal administration. 

For oral administration, the pharmaceutical 
compositions may take the form of, for example, tablets 
or capsules prepared by conventional means with 
pharmaceutically acceptable excipients such as binding 
agents (for example, pregelatinised maize starch, 
polyvinylpyrrolidone or hydroxypropyl methylcellulose); 
fillers (for example, lactose, microcrystalline cellulose 
or calcium hydrogen phosphate); lubricants (for example, 
magnesium stearate, talc or silica); disintegrants (for 
example, potato starch or sodium starch glycolate); or 
wetting agents (for example, sodium lauryl sulphate). 
The tablets may be coated by methods well known in the 
art. Liquid preparations for oral administration may 
take the form of, for example, solutions, syrups or 
suspensions, or they may be presented as a dry product 
for constitution with water or other suitable vehicle 
before use. Such liquid preparations may be prepared by 
conventional means with pharmaceutically acceptable 
additives such as suspending agents (for example, 
sorbitol syrup, cellulose derivatives or hydrogenated 
edible fats); emulsifying agents (for example, lecithin 
or acacia); non-aqueous vehicles (for example, almond 
oil, oily esters, ethyl alcohol or fractionated vegetable 
oils); and preservatives (for example, methyl or 
propyl-p-hydroxybenzoates or sorbic acid). The preparations may 
also contain buffer salts, flavoring, coloring and 
sweetening agents as appropriate. 

Preparations for oral administration may be suitably 
formulated to give controlled release of the active 
compound. 

For buccal administration the compositions may take 
the form of tablets or lozenges formulated in 
conventional manner. 

For administration by inhalation, the compounds for 
use according to the present invention are conveniently 
delivered in the form of an aerosol spray presentation 
from pressurized packs or a nebulizer, with the use of a 
suitable propellant, for example, 
dichlorodifluoromethane, trichlorofluoromethane, 
dichlorotetrafluoroethane, carbon dioxide or other 
suitable gas. In the case of a pressurized aerosol the 
dosage unit may be determined by providing a valve to 
deliver a metered amount. Capsules and cartridges of, 
for example, gelatin for use in an inhaler or insufflator 
may be formulated containing a powder mix of the compound 
and a suitable powder base such as lactose or starch. 

The compounds may be formulated for parenteral 
administration by injection, for example, by bolus 
injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, for 
example, in ampoules or in multi-dose containers, with an 
added preservative. The compositions may take such forms 
as suspensions, solutions or emulsions in oily or aqueous 
vehicles, and may contain formulatory agents such as 
suspending, stabilizing and/or dispersing agents. 
Alternatively, the active ingredient may be in powder 
form for constitution with a suitable vehicle, for 
example, sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal 
compositions such as suppositories or retention enemas, 
for example, containing conventional suppository bases 
such as cocoa butter or other glycerides. 

In addition to the formulations described 
previously, the compounds may also be formulated as a 
depot preparation. Such long acting formulations may be 
administered by implantation (for example, subcutaneously 
or intramuscularly) or by intramuscular injection. Thus, 
for example, the compounds may be formulated with 
suitable polymeric or hydrophobic materials (for example, 
as an emulsion in an acceptable oil) or ion exchange 
resins, or as sparingly soluble derivatives, for example, 
as a sparingly soluble salt. 

The compositions may, if desired, be presented in a 
pack or dispenser device which may contain one or more 
unit dosage forms containing the active ingredient. The 
pack may, for example, comprise metal or plastic foil, such 
as a blister pack. The pack or dispenser device may be 
accompanied by instructions for administration. 

The therapeutic compositions of the invention can 
also contain a carrier or excipient, many of which are 
known to skilled artisans. Excipients which can be used 
include buffers (for example, citrate buffer, phosphate 
buffer, acetate buffer, and bicarbonate buffer), amino 
acids, urea, alcohols, ascorbic acid, phospholipids, 
proteins (for example, serum albumin), EDTA, sodium 
chloride, liposomes, mannitol, sorbitol, and glycerol. 

The nucleic acids, polypeptides, antibodies, or 
modulatory compounds of the invention can be administered 
by any standard route of administration. For example, 
administration can be parenteral, intravenous, 
subcutaneous, intramuscular, intracranial, intraorbital, 
ophthalmic, intraventricular, intracapsular, intraspinal, 
intracisternal, intraperitoneal, transmucosal, or oral. 
The modulatory compound can be formulated in various 
ways, according to the corresponding route of 
administration. For example, liquid solutions can be 
made for ingestion or injection; gels or powders can be 
made for ingestion, inhalation, or topical application. 
Methods for making such formulations are well known and 
can be found in, for example, "Remington's Pharmaceutical 
Sciences." It is expected that the preferred route of 
administration will be intravenous. 

Example 

Described below is the identification, sequencing, 
and characterization of a human huchordin gene. 

A novel open reading frame was identified during 
genomic sequencing of a human bacterial artificial 
chromosome. The open reading frame was located 
approximately 4 kb upstream of the thrombopoietin gene. 
A genomic fragment within the open reading frame was used 
to probe a human brain cDNA library (Clontech; Palo Alto, 
CA). A near full-length cDNA clone, lacking only two 
nucleotides of the initial Met codon, was identified. 
The identity of the missing nucleotides was confirmed by 
comparison to the genomic sequence. The cDNA clone 
encoded an 867 amino acid protein. The cDNA sequence of 
huchordin is shown in Fig. 1 (SEQ ID NO:1). The 
huchordin encoding portion of this cDNA extends from 
nucleotide 1 to nucleotide 2601 (SEQ ID NO:3). The amino 
acid sequence of huchordin is also shown in Fig. 1 (SEQ 
ID NO:2). 
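
The coding coordinates and protein length given above are mutually consistent, as the following minimal arithmetic check shows. The function name is hypothetical; the coordinates are those stated in the text, and the range is taken to be inclusive and to exclude the stop codon.

```python
def encoded_residues(first_nt: int, last_nt: int) -> int:
    """Return the number of codons in an inclusive nucleotide range.

    Raises ValueError if the range is not a whole number of codons.
    """
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


# Nucleotides 1 to 2601: 2601 / 3 = 867 codons, matching the 867 amino acid protein.
huchordin_residues = encoded_residues(1, 2601)
```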

Huchordin is predicted to be a secreted protein 
having a signal sequence extending from amino acid 1 to 
amino acid 26. At the amino acid level, huchordin is 53% 
identical to Xenopus chordin (Sasai et al., Cell 79:779, 
1994). Fig. 3 is an alignment of a portion of the amino 
acid sequence of huchordin and a portion of the amino 
acid sequence of Xenopus chordin (SEQ ID NO:4). Variants 
of huchordin which are more likely to retain activity do 
not have alterations at the amino acid positions 
conserved between huchordin and chordin. 
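
A percent identity figure such as the one quoted above is computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap. The text does not state which convention or alignment program was used, so the convention, the function name, the gap character, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = ".") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both strings must already be aligned and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only.
toy_identity = percent_identity("MLLQNELF", "MLLQSELF")
```

The columns that match in such an alignment are the conserved positions at which, according to the text, alterations are best avoided in active variants.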

A human Northern blot (Clontech; Palo Alto, CA) 
probed with a full-length huchordin cDNA clone revealed 
the presence of an approximately 7.5 kb transcript in 
adult liver and fetal liver, an approximately 2.7 kb 
transcript in fetal liver, and an approximately 4.4 kb 
transcript in brain, heart, and pancreas. 

As noted above, huchordin has homology to Xenopus 
chordin, a secreted molecule that functions as a 
dorsalizing factor in early embryo development. Chordin 
binds and antagonizes BMP-4, a member of the TGF-beta 
superfamily. 

Huchordin may bind members of the TGF-beta 
superfamily, e.g., TGF-beta. To the extent that 
huchordin (or fragments thereof) binds TGF-beta, huchordin 
can be used to reduce TGF-beta activity, for example, to 
reduce fibrosis of the kidney, liver, or lung. 

The cysteine rich repeats of huchordin are found in 
thrombospondin-1, thrombospondin-2, and procollagen, 
proteins with anti-angiogenic activity. Thus, huchordin 
(or fragments thereof which include one or more of the 
cysteine rich repeats) can be used to inhibit 
angiogenesis. Such inhibition is useful in limiting 
tumor growth. 
Deposit Statement 

E. coli strain fth66 harboring a huchordin cDNA 
clone was deposited with the American Type Culture 
Collection on July 2, 1997 and given ATCC Accession No. 
98481. 

This culture has been deposited under conditions 
that assure that access to the culture will be available 
during the pendency of the patent application to one 
determined by the Commissioner of Patents and Trademarks 
to be entitled thereto under 37 CFR 1.14 and 35 U.S.C. 
122. The deposit is available as required by foreign 
patent laws in countries wherein counterparts of the 
subject application, or its progeny, are filed. However, 
it should be understood that the availability of the 
deposit does not constitute a license to practice the 
subject invention in derogation of patent rights granted 
by governmental action. 

Further, the culture deposit will be stored and made 
available to the public in accord with the provisions of 
the Budapest Treaty for the Deposit of Microorganisms, 
i.e., it will be stored with all the care necessary to 
keep it viable and uncontaminated for a period of at 
least five years after the most recent request for the 
furnishing of a sample of the deposit, and in any case, 
for a period of at least 30 (thirty) years after the date 
of deposit or for the enforceable life of any patent 
which may issue disclosing the culture plus five years 
after the last request for a sample from the deposit. 

The depositor acknowledges the duty to replace the 
deposit should the depository be unable to furnish a 
sample when requested, due to the condition of the 
deposit. All restrictions on the availability to the 
public of the deposit will be irrevocably removed upon 
the granting of a patent disclosing it. 
What is claimed is: 

1. An isolated nucleic acid molecule selected from 
the group consisting of a nucleic acid molecule encoding 
the amino acid sequence of SEQ ID NO:2 and a nucleic acid 
molecule which hybridizes under stringent conditions to 
the nucleic acid molecule of SEQ ID NO:3, said 
hybridizing nucleic acid molecule having the sequence of 
a naturally-occurring mammalian gene. 

2. An isolated nucleic acid molecule encoding amino 
acids 27-867 of SEQ ID NO:2. 

3. An isolated nucleic acid molecule encoding a 
polypeptide having a sequence that is at least 85% 
identical to the sequence of SEQ ID NO:2. 

4. An isolated nucleic acid molecule encoding the 
amino acid sequence of SEQ ID NO:2. 

5. The isolated nucleic acid molecule of claim 1, 
the molecule comprising the nucleotide sequence of SEQ ID 
NO:3. 

6. The isolated nucleic acid molecule of claim 1, 
the molecule hybridizing under stringent conditions to a 
nucleic acid molecule having the sequence of SEQ ID NO:3 
or its complement. 

7. The isolated nucleic acid molecule of claim 1 
comprising the protein coding portion of the huchordin 
gene contained in A.T.C.C. Deposit No. 98481. 

8. The isolated nucleic acid molecule of claim 1, 
wherein the isolated nucleic acid molecule hybridizes to 
the nucleotide sequence of the protein coding portion of 
the huchordin gene contained in A.T.C.C. Deposit No. 
98481. 

9. A host cell comprising the isolated nucleic acid 
molecule of claim 1. 

10. A nucleic acid vector comprising the nucleic 
acid molecule of claim 1. 

11. The nucleic acid vector of claim 10, wherein 
the vector is an expression vector. 

12. The vector of claim 11, further comprising a 
regulatory element. 

13. The vector of claim 12, wherein the regulatory 
element is selected from the group consisting of the 
cytomegalovirus hCMV immediate early gene, the early 
promoter of SV40 adenovirus, the late promoter of SV40 
adenovirus, the lac system, the trp system, the TAC 
system, the TRC system, the major operator and promoter 
regions of phage λ, the control regions of fd coat 
protein, the promoter for 3-phosphoglycerate kinase, the 
promoters of acid phosphatase, and the promoters of the 
yeast α-mating factors. 

14. The vector of claim 12, wherein the regulatory 
element directs tissue-specific expression. 

15. The vector of claim 11, further comprising a 
reporter gene. 
16. The vector of claim 15, wherein the reporter 
gene is selected from the group consisting of 
β-lactamase, chloramphenicol acetyltransferase (CAT), 
adenosine deaminase (ADA), aminoglycoside 
phosphotransferase (neor, G418r), dihydrofolate reductase 
(DHFR), hygromycin-B-phosphotransferase (HPH), thymidine 
kinase (TK), lacZ (encoding β-galactosidase), and 
xanthine guanine phosphoribosyltransferase (XGPRT). 

17. The vector of claim 10, wherein the vector is a 
plasmid. 

18. The vector of claim 10, wherein said vector is 
a virus. 

19. The vector of claim 18, wherein said virus is a 
retrovirus. 

20. A substantially pure polypeptide selected from 
the group consisting of a polypeptide having the sequence 
of SEQ ID NO:2 and a polypeptide encoded by a nucleic acid 
molecule which hybridizes under stringent conditions to 
the nucleic acid molecule of SEQ ID NO:3, said 
hybridizing nucleic acid molecule having the sequence of 
a naturally-occurring mammalian gene. 

21. The polypeptide of claim 20, wherein the 
polypeptide is soluble under physiological conditions. 

22. The polypeptide of claim 20, said polypeptide 
comprising an amino acid sequence that is at least 80% 
identical to the amino acid sequence of SEQ ID NO:2. 

23. The polypeptide of claim 20, said polypeptide 
comprising an amino acid sequence that is at least 95% 
identical to the amino acid sequence of SEQ ID NO:2. 

24. The polypeptide of claim 20, said polypeptide 
comprising an amino acid sequence that is identical to 
the amino acid sequence of SEQ ID NO:2. 

25. The polypeptide of claim 20, wherein the 
polypeptide comprises the cysteine-rich domains of 
huchordin. 

26. A substantially pure polypeptide comprising a 
first portion and a second portion, said first portion 
comprising a huchordin polypeptide and said second 
portion comprising a detectable marker. 

27. An antibody that specifically binds to a 
huchordin polypeptide. 

28. The antibody of claim 27, wherein said antibody 
is a monoclonal antibody. 

29. A pharmaceutical composition comprising the 
polypeptide of claim 20. 

30. A method for detecting huchordin in a sample, 
said method comprising: 

(a) obtaining a biological sample; 

(b) contacting said biological sample with an 
antibody that specifically binds huchordin under 
conditions that allow the formation of huchordin-antibody 
complexes; and 

(c) detecting said complexes, if any, as an 
indication of the presence of huchordin in said sample. 
31. A method of identifying a compound that 
modulates the expression of huchordin, said method 
comprising comparing the level of expression of huchordin 
in a cell in the presence and absence of a selected 
compound, wherein a difference in the level of expression 
in the presence and absence of said selected compound 
indicates that said selected compound modulates the 
expression of huchordin. 

32. A method of identifying a compound that 
modulates the activity of huchordin, said method 
comprising comparing the level of activity of huchordin 
in a cell in the presence and absence of a selected 
compound, wherein a difference in the level of activity 
in the presence and absence of said selected compound 
indicates that said selected compound modulates the 
activity of huchordin. 

33. A method for treating a patient suffering from 
a disorder associated with excessive expression or 
activity of huchordin, comprising administering to said 
patient a compound which inhibits expression or activity 
of huchordin. 

34. A method for treating a patient suffering from 
a disorder associated with insufficient expression or 
activity of huchordin, comprising administering to said 
patient a compound which increases expression or activity 
of huchordin. 
35. A method for diagnosing a disorder associated 
with aberrant expression of huchordin, comprising 
obtaining a biological sample from a patient and 
measuring huchordin expression in said biological sample, 
wherein increased or decreased huchordin expression in 
said biological sample compared to a control indicates 
that said patient suffers from a disorder associated with 
aberrant expression of huchordin. 

36. A method for diagnosing a disorder associated 
with aberrant activity of huchordin, comprising obtaining 
a biological sample from a patient and measuring 
huchordin activity in said biological sample, wherein 
increased or decreased huchordin activity in said 
biological sample compared to a control indicates that 
said patient suffers from a disorder associated with 
aberrant activity of huchordin. 

37. An isolated nucleic acid molecule which 
hybridizes to a nucleic acid molecule having the sequence 
of nucleotides 182 to 850, inclusive, of SEQ ID NO:1 or 
its complement. 

38. An isolated nucleic acid molecule having a 
sequence which is at least 95% identical to the sequence 
of nucleotides 182 to 850, inclusive, of SEQ ID NO:1. 

39. A polypeptide encoded by the nucleic acid 
molecule of any of claims 34 or 35. 

[Fig. 1: cDNA sequence of huchordin (SEQ ID NO:1) and the encoded amino acid sequence (SEQ ID NO:2).] 

[Figure: alignment of portions of the amino acid sequences of huchordin and Xenopus chordin.] 

TLRLLRMGH I LVSLVTTTLS E PE I SGKI VKHKALFS ES FS ALL TPEDSDE 296 

QGVGG ITLLTLSDTEDSLHFLLLFRGLLEPRSGGL7QVPLRLQILHQGQL 349 

M|:.MIIIMMMIM|::||| :: |:|: :|| ||.:: 

TGGGGIiAMLTLSDVDDNLHFILMLRGLSGEEGD . . . QIPILVQISHQNHV 343 

LRELQANVSAQEPGFAEVLPNLTVQEMDWLVLGELQMALEWAGRPGLRI S 3 99 
Mil I I M II I • M I I I I I M • Ml IM | : I : : : : MM- -M 
IRELYANISAQEQDFAEVLPDLSSREMLWLAQGQLEI3VQTEGRRPQSMS 3 93 

GHIAARK3CDVLQSVLCGADALIPVQTGAAGSASLTLLGNGSLIYQVQW ^49 

! I • • I I I I I • I I I I I ' M I I I * * I i I ■ I I | | : | | : | M | | « | * - 

GI ITl’RKSCDTLQSVLSGGDALNPTKTGAVGSASITLHENGTLEYQIQIA 443 
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GVPHCERDDCSLPLSCGSGKESRCCSRC TAHRRPAPETRTDPEL 8 S5 

I : • : I I : : I • • : • • s s I • I II • : I • • • : ... | | | | . . s 

GITQCRRQECTGTTCGTGSKRDRCCTKCKDANQDEDEKVKSDETRTFWSF 941 
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SEQOENCE LISTING 



(1) GENERAL INFORMATION

(i) APPLICANT: MILLENNIUM BIOTHERAPEUTICS, INC.
(ii) TITLE OF THE INVENTION: HUCHORDIN AND USES THEREOF
(iii) NUMBER OF SEQUENCES: 4
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Fish & Richardson P.C.
(B) STREET: 225 Franklin Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2804
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows 95
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US98/
(B) FILING DATE: 28 September 1998
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: U.S. Serial No. 08/938,365
(B) FILING DATE: 26 September 1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Meiklejohn, Ph.D., Anita L.
(B) REGISTRATION NUMBER: 35,283
(C) REFERENCE/DOCKET NUMBER: 09404/040W01
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617/542-5070
(B) TELEFAX: 617/542-8906
(C) TELEX: 200154

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3037 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...2601
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG CCG AGC CTC CCG GCC CCG CCG GCC CCG CTG CTG CTC CTC GGG CTG 48
Met Pro Ser Leu Pro Ala Pro Pro Ala Pro Leu Leu Leu Leu Gly Leu
1 5 10 15

CTG CTG CTC GGC TCC CGG CCG GCC CGC GGC GCC GGC CCA GAG CCC CCC 96
Leu Leu Leu Gly Ser Arg Pro Ala Arg Gly Ala Gly Pro Glu Pro Pro
20 25 30

GTG CTG CCC ATC CGT TCT GAG AAG GAG CCG CTG CCC GTT CGG GGA GCG 144
Val Leu Pro Ile Arg Ser Glu Lys Glu Pro Leu Pro Val Arg Gly Ala
35 40 45

GCA GGC TGC ACC TTC GGC GGG AAG GTC TAT GCC TTG GAC GAG ACG TGG 192
Ala Gly Cys Thr Phe Gly Gly Lys Val Tyr Ala Leu Asp Glu Thr Trp
50 55 60

CAC CCG GAC CTA GGG GAG CCA TTC GGG GTG ATG CGC TGC GTG CTG TGC 240
His Pro Asp Leu Gly Glu Pro Phe Gly Val Met Arg Cys Val Leu Cys
65 70 75 80

GCC TGC GAG GCG CCT CAG TGG GGT CGC CGT ACC AGG GGC CCT GGC AGG 288
Ala Cys Glu Ala Pro Gln Trp Gly Arg Arg Thr Arg Gly Pro Gly Arg
85 90 95

GTC AGC TGC AAG AAC ATC AAA CCA GAG TGC CCA ACC CCG GCC TGT GGG 336
Val Ser Cys Lys Asn Ile Lys Pro Glu Cys Pro Thr Pro Ala Cys Gly
100 105 110

CAG CCG CGC CAG CTG CCG GGA CAC TGC TGC CAG ACC TGC CCC CAG GAG 384
Gln Pro Arg Gln Leu Pro Gly His Cys Cys Gln Thr Cys Pro Gln Glu
115 120 125

CGC AGC AGT TCG GAG CGG CAG CCG AGC GGC CTG TCC TTC GAG TAT CCG 432
Arg Ser Ser Ser Glu Arg Gln Pro Ser Gly Leu Ser Phe Glu Tyr Pro
130 135 140

CGG GAC CCG GAG CAT CGC AGT TAT AGC GAC CGC GGG GAG CCA GGC GCT 480
Arg Asp Pro Glu His Arg Ser Tyr Ser Asp Arg Gly Glu Pro Gly Ala
145 150 155 160

GAG GAG CGG GCC CGT GGT GAC GGC CAC ACG GAC TTC GTG GCG CTG CTG 528
Glu Glu Arg Ala Arg Gly Asp Gly His Thr Asp Phe Val Ala Leu Leu
165 170 175

ACA GGG CCG AGG TCG CAG GCG GTG GCA CGA GCC CGA GTC TCG CTG CTG 576
Thr Gly Pro Arg Ser Gln Ala Val Ala Arg Ala Arg Val Ser Leu Leu
180 185 190

CGC TCT AGC CTC CGC TTC TCT ATC TCC TAC AGG CGG CTG GAC CGC CCT 624
Arg Ser Ser Leu Arg Phe Ser Ile Ser Tyr Arg Arg Leu Asp Arg Pro
195 200 205

ACC AGG ATC CGC TTC TCA GAC TCC AAT GGC AGT GTC CTG TTT GAG CAC 672
Thr Arg Ile Arg Phe Ser Asp Ser Asn Gly Ser Val Leu Phe Glu His
210 215 220

CCT GCA GCC CCC ACC CAA GAT GGC CTG GTC TGT GGG GTG TGG CGG GCA 720
Pro Ala Ala Pro Thr Gln Asp Gly Leu Val Cys Gly Val Trp Arg Ala
225 230 235 240

GTG CCT CGG TTG TCT CTG CGG CTC CTT AGG GCA GAA CAG CTG CAT GTG 768
Val Pro Arg Leu Ser Leu Arg Leu Leu Arg Ala Glu Gln Leu His Val
245 250 255

GCA CTT GTG ACA CTC ACT CAC CCT TCA GGG GAG GTC TGG GGG CCT CTC 816
Ala Leu Val Thr Leu Thr His Pro Ser Gly Glu Val Trp Gly Pro Leu
260 265 270

ATC CGG CAC CGG GCC CTG GCT GCA GAG ACC TTC AGT GCC ATC CTG ACT 864
Ile Arg His Arg Ala Leu Ala Ala Glu Thr Phe Ser Ala Ile Leu Thr
275 280 285

CTA GAA GGC CCC CCA CAG CAG GGC GTA GGG GGC ATC ACC CTG CTC ACT 912
Leu Glu Gly Pro Pro Gln Gln Gly Val Gly Gly Ile Thr Leu Leu Thr
290 295 300

CTC AGT GAC ACA GAG GAC TCC TTG CAT TTT TTG CTG CTC TTC CGA GGG 960
Leu Ser Asp Thr Glu Asp Ser Leu His Phe Leu Leu Leu Phe Arg Gly
305 310 315 320

CTG CTG GAA CCC AGG AGT GGG GGA CTA ACC CAG GTT CCC TTG AGG CTC 1008
Leu Leu Glu Pro Arg Ser Gly Gly Leu Thr Gln Val Pro Leu Arg Leu
325 330 335

CAG ATT CTA CAC CAG GGG CAG CTA CTG CGA GAA CTT CAG GCC AAT GTC 1056
Gln Ile Leu His Gln Gly Gln Leu Leu Arg Glu Leu Gln Ala Asn Val
340 345 350

TCA GCC CAG GAA CCA GGC TTT GCT GAG GTG CTG CCC AAC CTG ACA GTC 1104
Ser Ala Gln Glu Pro Gly Phe Ala Glu Val Leu Pro Asn Leu Thr Val
355 360 365

CAG GAG ATG GAC TGG CTG GTG CTG GGG GAG CTG CAG ATG GCC CTG GAG 1152
Gln Glu Met Asp Trp Leu Val Leu Gly Glu Leu Gln Met Ala Leu Glu
370 375 380

TGG GCA GGC AGG CCA GGG CTG CGC ATC AGT GGA CAC ATT GCT GCC AGG 1200
Trp Ala Gly Arg Pro Gly Leu Arg Ile Ser Gly His Ile Ala Ala Arg
385 390 395 400

AAG AGC TGC GAC GTC CTG CAA AGT GTC CTT TGT GGG GCT GAT GCC CTG 1248
Lys Ser Cys Asp Val Leu Gln Ser Val Leu Cys Gly Ala Asp Ala Leu
405 410 415

ATC CCA GTC CAG ACG GGT GCT GCC GGC TCA GCC AGC CTC ACG CTG CTA 1296
Ile Pro Val Gln Thr Gly Ala Ala Gly Ser Ala Ser Leu Thr Leu Leu
420 425 430

GGA AAT GGC TCC CTG ATC TAT CAG GTG CAA GTG GTA GGG ACA AGC AGT 1344
Gly Asn Gly Ser Leu Ile Tyr Gln Val Gln Val Val Gly Thr Ser Ser
435 440 445

GAG GTG GTG GCC ATG ACA CTG GAG ACC AAG CCT CAG CGG AGG GAT CAG 1392
Glu Val Val Ala Met Thr Leu Glu Thr Lys Pro Gln Arg Arg Asp Gln
450 455 460

CGC ACT GTC CTG TGC CAC ATG GCT GGA CTC CAG CCA GGA GGA CAC ACG 1440
Arg Thr Val Leu Cys His Met Ala Gly Leu Gln Pro Gly Gly His Thr
465 470 475 480

GCC GTG GGT ATC TGC CCT GGG CTG GGT GCC CGA GGG GCT CAT ATG CTG 1488
Ala Val Gly Ile Cys Pro Gly Leu Gly Ala Arg Gly Ala His Met Leu
485 490 495

CTG CAG AAT GAG CTC TTC CTG AAC GTG GGC ACC AAG GAC TTC CCA GAC 1536
Leu Gln Asn Glu Leu Phe Leu Asn Val Gly Thr Lys Asp Phe Pro Asp
500 505 510

GGA GAG CTT CGG GGG CAC GTG GCT GCC CTG CCC TAC TGT GGG CAT AGC 1584
Gly Glu Leu Arg Gly His Val Ala Ala Leu Pro Tyr Cys Gly His Ser
515 520 525

GCC CGC CAT GAC ACG CTG TCC GTG CCC CTA GCA GGA GCC CTG GTG CTA 1632
Ala Arg His Asp Thr Leu Ser Val Pro Leu Ala Gly Ala Leu Val Leu
530 535 540

CCC CCT GTG AAG AGC CAA GCA GCA GGG CAC GCC TGG CTT TCC TTG GAT 1680
Pro Pro Val Lys Ser Gln Ala Ala Gly His Ala Trp Leu Ser Leu Asp
545 550 555 560

ACC CAC TGT CAC CTG CAC TAT GAA GTG CTG CTG GCT GGG CTT GGT GGC 1728
Thr His Cys His Leu His Tyr Glu Val Leu Leu Ala Gly Leu Gly Gly
565 570 575

TCA GAA CAA GGC ACT GTC ACT GCC CAC CTC CTT GGG CCT CCT GGA ACG 1776
Ser Glu Gln Gly Thr Val Thr Ala His Leu Leu Gly Pro Pro Gly Thr
580 585 590

CCA GGG CCT CGG CGG CTG CTG AAG GGA TTC TAT GGC TCA GAG GCC CAG 1824
Pro Gly Pro Arg Arg Leu Leu Lys Gly Phe Tyr Gly Ser Glu Ala Gln
595 600 605

GGT GTG GTG AAG GAC CTG GAG CCG GAA CTG CTG CGG CAC CTG GCA AAA 1872
Gly Val Val Lys Asp Leu Glu Pro Glu Leu Leu Arg His Leu Ala Lys
610 615 620

GGC ATG GCC TCC CTG ATG ATC ACC ACC AAG GGT AGC CCC AGA GGG GAG 1920
Gly Met Ala Ser Leu Met Ile Thr Thr Lys Gly Ser Pro Arg Gly Glu
625 630 635 640

CTC CGA GGG CAG AGA CGA ACG GTG ATC TGT GAC CCG GTG GTG TGC CCA 1968
Leu Arg Gly Gln Arg Arg Thr Val Ile Cys Asp Pro Val Val Cys Pro
645 650 655

CCG CCC AGC TGC CCA CAC CCG GTG CAG GCT CCC GAC CAG TGC TGC CCT 2016
Pro Pro Ser Cys Pro His Pro Val Gln Ala Pro Asp Gln Cys Cys Pro
660 665 670

GTT TGC CCT GAG AAA CAA GAT GTC AGA GAC TTG CCA GGG CTG CCA AGG 2064
Val Cys Pro Glu Lys Gln Asp Val Arg Asp Leu Pro Gly Leu Pro Arg
675 680 685

AGC CGG GAC CCA GGA GAG GGC TGC TAT TTT GAT GGT GAC CGG AGC TGG 2112
Ser Arg Asp Pro Gly Glu Gly Cys Tyr Phe Asp Gly Asp Arg Ser Trp
690 695 700

CGG GCA GCG GGT ACG CGG TGG CAC CCC GTT GTG CCC CCC TTT GGC TTA 2160
Arg Ala Ala Gly Thr Arg Trp His Pro Val Val Pro Pro Phe Gly Leu
705 710 715 720

ATT AAG TGT GCT GTC TGC ACC TGC AAG GGG GGC ACT GGA GAG GTG CAC 2208
Ile Lys Cys Ala Val Cys Thr Cys Lys Gly Gly Thr Gly Glu Val His
725 730 735

TGT GAG AAG GTG CAG TGT CCC CGG CTG GCC TGT GCC CAG CCT GTG CGT 2256
Cys Glu Lys Val Gln Cys Pro Arg Leu Ala Cys Ala Gln Pro Val Arg
740 745 750

GTC AAC CCC ACC GAC TGC TGC AAA CAG TGT CCA GTG GGG TCG GGG GCC 2304
Val Asn Pro Thr Asp Cys Cys Lys Gln Cys Pro Val Gly Ser Gly Ala
755 760 765

CAC CCC CAG CTG GGG GAC CCC ATG CAG GCT GAT GGG CCC CGG GGC TGC 2352
His Pro Gln Leu Gly Asp Pro Met Gln Ala Asp Gly Pro Arg Gly Cys
770 775 780

CGT TTT GCT GGG CAG TGG TTC CCA GAG AGT CAG AGC TGG CAC CCC TCA 2400
Arg Phe Ala Gly Gln Trp Phe Pro Glu Ser Gln Ser Trp His Pro Ser
785 790 795 800

GTG CCC CCT TTT GGA GAG ATG AGC TGT ATC ACC TGC AGA TGT GGG GCA 2448
Val Pro Pro Phe Gly Glu Met Ser Cys Ile Thr Cys Arg Cys Gly Ala
805 810 815

GGG GTG CCT CAC TGT GAG CGG GAT GAC TGT TCA CTG CCA CTG TCC TGT 2496
Gly Val Pro His Cys Glu Arg Asp Asp Cys Ser Leu Pro Leu Ser Cys
820 825 830

GGC TCG GGG AAG GAG AGT CGA TGC TGT TCC CGC TGC ACG GCC CAC CGG 2544
Gly Ser Gly Lys Glu Ser Arg Cys Cys Ser Arg Cys Thr Ala His Arg
835 840 845

CGG CCA GCC CCA GAG ACC AGA ACT GAT CCA GAG CTG GAG AAA GAA GCC 2592
Arg Pro Ala Pro Glu Thr Arg Thr Asp Pro Glu Leu Glu Lys Glu Ala
850 855 860

GAA GGC TCT TAGGGAGCAG CCAGAGGGCC AAGTGACCAA GAGGATGGGG CCTGAGCTG 2650
Glu Gly Ser
865

GGGAAGGGGT GGCATCGAGG ACCTTCTTGC ATTCTCCTGT GGGAAGCCCA GTGCCTTTGC 2710
TCCTCTGTCC TGCCTCTACT CCCACCCCCA CTACCTTTGG GAACCACAGC TCCACAAGGG 2770
GGAGAGGCAG CTGGGCCAGA CCGAGGTCAC AGCCACTCCA AGTCCTGCCC TGCCACCCTC 2830
GGCCTCTGTC CTTGGAAGCC CCACCCCTTT CCTCCTGTAC ATAATGTCAC TGGCTTGTTG 2890
GGATTTTTAA TTTATCTTCA CTCAGCACCA AGGGCCCCCG ACACTCCACT CCTGCTGCCC 2950
CTGAGCTGAG CAGAGTCATT ATTGGAGAGT TTTGTATTTA TTAAAACATT TCTTTTTCAG 3010
TCAAAAAAAA AAAAAAAGGG CGGCCGC 3037
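
The coding-sequence feature of SEQ ID NO:1 (LOCATION: 1...2601) can be cross-checked informally against the amino acid lines printed beneath the codons. The following is a minimal Python sketch that assumes the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the filed listing. Only the two short codon runs passed to `translate` are quoted from the listing (the first 16 codons, and the final three codons followed by the TAG stop).

```python
from itertools import product

# Standard genetic code (assumed here), with codons enumerated in
# T, C, A, G order at each position: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes.

    Whitespace is ignored; translation stops at the first stop codon.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First 16 codons of SEQ ID NO:1 (nucleotides 1 to 48).
first_row = translate(
    "ATG CCG AGC CTC CCG GCC CCG CCG GCC CCG CTG CTG CTC CTC GGG CTG"
)
# Last three codons (nucleotides 2593 to 2601) followed by the stop codon.
last_row = translate("GAA GGC TCT TAG")
print(first_row, last_row)
```

The same arithmetic explains the stated lengths: a coding region of 2601 nucleotides corresponds to 2601 / 3 = 867 codons, which agrees with the 867 amino acids recited for SEQ ID NO:2.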



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 867 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Pro Ser Leu Pro Ala Pro Pro Ala Pro Leu Leu Leu Leu Gly Leu
1 5 10 15

Leu Leu Leu Gly Ser Arg Pro Ala Arg Gly Ala Gly Pro Glu Pro Pro
20 25 30

Val Leu Pro Ile Arg Ser Glu Lys Glu Pro Leu Pro Val Arg Gly Ala
35 40 45

Ala Gly Cys Thr Phe Gly Gly Lys Val Tyr Ala Leu Asp Glu Thr Trp
50 55 60

His Pro Asp Leu Gly Glu Pro Phe Gly Val Met Arg Cys Val Leu Cys
65 70 75 80

Ala Cys Glu Ala Pro Gln Trp Gly Arg Arg Thr Arg Gly Pro Gly Arg
85 90 95

Val Ser Cys Lys Asn Ile Lys Pro Glu Cys Pro Thr Pro Ala Cys Gly
100 105 110

Gln Pro Arg Gln Leu Pro Gly His Cys Cys Gln Thr Cys Pro Gln Glu
115 120 125

Arg Ser Ser Ser Glu Arg Gln Pro Ser Gly Leu Ser Phe Glu Tyr Pro
130 135 140

Arg Asp Pro Glu His Arg Ser Tyr Ser Asp Arg Gly Glu Pro Gly Ala
145 150 155 160

Glu Glu Arg Ala Arg Gly Asp Gly His Thr Asp Phe Val Ala Leu Leu
165 170 175

Thr Gly Pro Arg Ser Gln Ala Val Ala Arg Ala Arg Val Ser Leu Leu
180 185 190

Arg Ser Ser Leu Arg Phe Ser Ile Ser Tyr Arg Arg Leu Asp Arg Pro
195 200 205

Thr Arg Ile Arg Phe Ser Asp Ser Asn Gly Ser Val Leu Phe Glu His
210 215 220

Pro Ala Ala Pro Thr Gln Asp Gly Leu Val Cys Gly Val Trp Arg Ala
225 230 235 240

Val Pro Arg Leu Ser Leu Arg Leu Leu Arg Ala Glu Gln Leu His Val
245 250 255

Ala Leu Val Thr Leu Thr His Pro Ser Gly Glu Val Trp Gly Pro Leu
260 265 270

Ile Arg His Arg Ala Leu Ala Ala Glu Thr Phe Ser Ala Ile Leu Thr
275 280 285

Leu Glu Gly Pro Pro Gln Gln Gly Val Gly Gly Ile Thr Leu Leu Thr
290 295 300

Leu Ser Asp Thr Glu Asp Ser Leu His Phe Leu Leu Leu Phe Arg Gly
305 310 315 320

Leu Leu Glu Pro Arg Ser Gly Gly Leu Thr Gln Val Pro Leu Arg Leu
325 330 335

Gln Ile Leu His Gln Gly Gln Leu Leu Arg Glu Leu Gln Ala Asn Val
340 345 350

Ser Ala Gln Glu Pro Gly Phe Ala Glu Val Leu Pro Asn Leu Thr Val
355 360 365

Gln Glu Met Asp Trp Leu Val Leu Gly Glu Leu Gln Met Ala Leu Glu
370 375 380

Trp Ala Gly Arg Pro Gly Leu Arg Ile Ser Gly His Ile Ala Ala Arg
385 390 395 400

Lys Ser Cys Asp Val Leu Gln Ser Val Leu Cys Gly Ala Asp Ala Leu
405 410 415

Ile Pro Val Gln Thr Gly Ala Ala Gly Ser Ala Ser Leu Thr Leu Leu
420 425 430

Gly Asn Gly Ser Leu Ile Tyr Gln Val Gln Val Val Gly Thr Ser Ser
435 440 445

Glu Val Val Ala Met Thr Leu Glu Thr Lys Pro Gln Arg Arg Asp Gln
450 455 460

Arg Thr Val Leu Cys His Met Ala Gly Leu Gln Pro Gly Gly His Thr
465 470 475 480

Ala Val Gly Ile Cys Pro Gly Leu Gly Ala Arg Gly Ala His Met Leu
485 490 495

Leu Gln Asn Glu Leu Phe Leu Asn Val Gly Thr Lys Asp Phe Pro Asp
500 505 510

Gly Glu Leu Arg Gly His Val Ala Ala Leu Pro Tyr Cys Gly His Ser
515 520 525

Ala Arg His Asp Thr Leu Ser Val Pro Leu Ala Gly Ala Leu Val Leu
530 535 540

Pro Pro Val Lys Ser Gln Ala Ala Gly His Ala Trp Leu Ser Leu Asp
545 550 555 560

Thr His Cys His Leu His Tyr Glu Val Leu Leu Ala Gly Leu Gly Gly
565 570 575

Ser Glu Gln Gly Thr Val Thr Ala His Leu Leu Gly Pro Pro Gly Thr
580 585 590

Pro Gly Pro Arg Arg Leu Leu Lys Gly Phe Tyr Gly Ser Glu Ala Gln
595 600 605

Gly Val Val Lys Asp Leu Glu Pro Glu Leu Leu Arg His Leu Ala Lys
610 615 620

Gly Met Ala Ser Leu Met Ile Thr Thr Lys Gly Ser Pro Arg Gly Glu
625 630 635 640

Leu Arg Gly Gln Arg Arg Thr Val Ile Cys Asp Pro Val Val Cys Pro
645 650 655

Pro Pro Ser Cys Pro His Pro Val Gln Ala Pro Asp Gln Cys Cys Pro
660 665 670

Val Cys Pro Glu Lys Gln Asp Val Arg Asp Leu Pro Gly Leu Pro Arg
675 680 685

Ser Arg Asp Pro Gly Glu Gly Cys Tyr Phe Asp Gly Asp Arg Ser Trp
690 695 700

Arg Ala Ala Gly Thr Arg Trp His Pro Val Val Pro Pro Phe Gly Leu
705 710 715 720

Ile Lys Cys Ala Val Cys Thr Cys Lys Gly Gly Thr Gly Glu Val His
725 730 735

Cys Glu Lys Val Gln Cys Pro Arg Leu Ala Cys Ala Gln Pro Val Arg
740 745 750

Val Asn Pro Thr Asp Cys Cys Lys Gln Cys Pro Val Gly Ser Gly Ala
755 760 765

His Pro Gln Leu Gly Asp Pro Met Gln Ala Asp Gly Pro Arg Gly Cys
770 775 780

Arg Phe Ala Gly Gln Trp Phe Pro Glu Ser Gln Ser Trp His Pro Ser
785 790 795 800

Val Pro Pro Phe Gly Glu Met Ser Cys Ile Thr Cys Arg Cys Gly Ala
805 810 815

Gly Val Pro His Cys Glu Arg Asp Asp Cys Ser Leu Pro Leu Ser Cys
820 825 830

Gly Ser Gly Lys Glu Ser Arg Cys Cys Ser Arg Cys Thr Ala His Arg
835 840 845

Arg Pro Ala Pro Glu Thr Arg Thr Asp Pro Glu Leu Glu Lys Glu Ala
850 855 860

Glu Gly Ser
865



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ala Pro Pro Ala Pro Leu Leu Leu Leu Gly Leu Leu Leu Leu Gly Ser
1 5 10 15

Arg Pro Ala Arg Gly Ala Gly Pro Glu Pro Pro Val Leu Pro Ile Arg
20 25 30

Ser Glu Lys Glu Pro Leu Pro Val Arg Gly Ala Ala Gly Cys Thr Phe
35 40 45

Gly Gly Lys Val Tyr Ala Leu Asp Glu Thr Trp His Pro Asp Leu Gly
50 55 60

Glu Pro Phe Gly Val Met Arg Cys Val Leu Cys Ala Cys Glu Ala Pro
65 70 75 80

Gln Trp Gly Arg Arg Thr Arg Gly Pro Gly Arg Val Ser Cys Lys Asn
85 90 95

Ile Lys Pro Glu Cys Pro Thr Pro Ala Cys Gly Gln Pro Arg Gln Leu
100 105 110

Pro Gly His Cys Cys Gln Thr Cys Pro Gln Glu Arg Ser Ser Ser Glu
115 120 125

Arg Gln Pro Ser Gly Leu Ser Phe Glu Tyr Pro Arg Asp Pro Glu His
130 135 140

Arg Ser Tyr Ser Asp Arg Gly Glu Pro Gly Ala Glu Glu Arg Ala Arg
145 150 155 160

Gly Asp Gly His Thr Asp Phe Val Ala Leu Leu Thr Gly Pro Arg Ser
165 170 175

Gln Ala Val Ala Arg Ala Arg Val Ser Leu Leu Arg Ser Ser Leu Arg
180 185 190

Phe Ser Ile Ser Tyr Arg Arg Leu Asp Arg Pro Thr Arg Ile Arg Phe
195 200 205

Ser Asp Ser Asn Gly Ser Val Leu Phe Glu His Pro Ala Ala Pro Thr
210 215 220

Gln Asp Gly Leu Val Cys Gly Val Trp Arg Ala Val Pro Arg Leu Ser
225 230 235 240

Leu Arg Leu Leu Arg Ala Glu Gln Leu His Val Ala Leu Val Thr Leu
245 250 255

Thr His Pro Ser Gly Glu Val Trp Gly Pro Leu Ile Arg His Arg Ala
260 265 270

Leu Ala Ala Glu Thr Phe Ser Ala Ile Leu Thr Leu Glu Gly Pro Pro
275 280 285

Gln Gln Gly Val Gly Gly Ile Thr Leu Leu Thr Leu Ser Asp Thr Glu
290 295 300

Asp Ser Leu His Phe Leu Leu Leu Phe Arg Gly Leu Leu Glu Pro Arg
305 310 315 320

Ser Gly Gly Leu Thr Gln Val Pro Leu Arg Leu Gln Ile Leu His Gln
325 330 335

Gly Gln Leu Leu Arg Glu Leu Gln Ala Asn Val Ser Ala Gln Glu Pro
340 345 350

Gly Phe Ala Glu Val Leu Pro Asn Leu Thr Val Gln Glu Met Asp Trp
355 360 365

Leu Val Leu Gly Glu Leu Gln Met Ala Leu Glu Trp Ala Gly Arg Pro
370 375 380

Gly Leu Arg Ile Ser Gly His Ile Ala Ala Arg Lys Ser Cys Asp Val
385 390 395 400

Leu Gln Ser Val Leu Cys Gly Ala Asp Ala Leu Ile Pro Val Gln Thr
405 410 415

Gly Ala Ala Gly Ser Ala Ser Leu Thr Leu Leu Gly Asn Gly Ser Leu
420 425 430

Ile Tyr Gln Val Gln Val Val Gly Thr Ser Ser Glu Val Val Ala Met
435 440 445

Thr Leu Glu Thr Lys Pro Gln Arg Arg Asp Gln Arg Thr Val Leu Cys
450 455 460

His Met Ala Gly Leu Gln Pro Gly Gly His Thr Ala Val Gly Ile Cys
465 470 475 480

Pro Gly Leu Gly Ala Arg Gly Ala His Met Leu Leu Gln Asn Glu Leu
485 490 495

Phe Leu Asn Val Gly Thr Lys Asp Phe Pro Asp Gly Glu Leu Arg Gly
500 505 510

His Val Ala Ala Leu Pro Tyr Cys Gly His Ser Ala Arg His Asp Thr
515 520 525

Leu Ser Val Pro Leu Ala Gly Ala Leu Val Leu Pro Pro Val Lys Ser
530 535 540

Gln Ala Ala Gly His Ala Trp Leu Ser Leu Asp Thr His Cys His Leu
545 550 555 560

His Tyr Glu Val Leu Leu Val Gly Leu Gly Gly Ser Glu Gln Gly Thr
565 570 575

Val Thr Ala His Leu Leu Gly Pro Pro Gly Thr Pro Gly Pro Arg Arg
580 585 590

Leu Leu Lys Gly Phe Tyr Gly Ser Glu Ala Gln Gly Val Val Lys Asp
595 600 605

Leu Glu Pro Glu Leu Leu Arg His Leu Ala Lys Gly Met Ala Ser Leu
610 615 620

Met Ile Thr Thr Lys Gly Ser Pro Arg Gly Glu Leu Arg Gly Gln Arg
625 630 635 640

Arg Thr Val Ile Cys Asp Pro Val Val Cys Pro Pro Pro Ser Cys Pro
645 650 655

His Pro Val Gln Ala Pro Asp Gln Cys Cys Pro Val Cys Pro Glu Lys
660 665 670

Gln Asp Val Arg Asp Leu Pro Gly Leu Pro Arg Ser Arg Asp Pro Gly
675 680 685

Glu Gly Cys Tyr Phe Asp Gly Asp Arg Ser Trp Arg Ala Ala Gly Thr
690 695 700

Arg Trp His Pro Val Val Pro Pro Phe Gly Leu Ile Lys Cys Ala Val
705 710 715 720

Cys Thr Cys Lys Gly Gly Thr Gly Glu Val His Cys Glu Lys Val Gln
725 730 735

Cys Pro Arg Leu Ala Cys Ala Gln Pro Val Arg Val Asn Pro Thr Asp
740 745 750

Cys Cys Lys Gln Cys Pro Val Gly Ser Gly Ala His Pro Gln Leu Gly
755 760 765

Asp Pro Met Gln Ala Asp Gly Pro Arg Gly Cys Arg Phe Ala Gly Gln
770 775 780

Trp Phe Pro Glu Ser Gln Ser Trp His Pro Ser Val Pro Pro Phe Gly
785 790 795 800

Glu Met Ser Cys Ile Thr Cys Arg Cys Gly Ala Gly Val Pro His Cys
805 810 815

Glu Arg Asp Asp Cys Ser Leu Pro Leu Ser Cys Gly Ser Gly Lys Glu
820 825 830

Ser Arg Cys Cys Ser Arg Cys Thr Ala His Arg Arg Pro Ala Pro Glu
835 840 845

Thr Arg Thr Asp Pro Glu Leu
850 855



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 940 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Gln Cys Pro Pro Ile Leu Leu Val Trp Thr Leu Trp Ile Met Ala Val
1 5 10 15

Asp Cys Ser Arg Pro Lys Val Phe Leu Pro Ile Gln Pro Glu Gln Glu
20 25 30

Pro Leu Gln Ser Lys Thr Pro Ala Gly Cys Thr Phe Gly Gly Lys Phe
35 40 45

Tyr Ser Leu Glu Asp Ser Trp His Pro Asp Leu Gly Glu Pro Phe Gly
50 55 60

Val Met His Cys Val Leu Cys Tyr Cys Glu Pro Gln Arg Ser Arg Arg
65 70 75 80

Gly Lys Pro Ser Gly Lys Val Ser Cys Lys Asn Ile Lys His Asp Cys
85 90 95

Pro Ser Pro Ser Cys Ala Asn Pro Ile Leu Leu Pro Leu His Cys Cys
100 105 110

Lys Thr Cys Pro Lys Ala Pro Pro Pro Pro Ile Lys Lys Ser Asp Phe
115 120 125

Val Phe Asp Gly Phe Glu Tyr Phe Gln Glu Lys Asp Asp Asp Leu Tyr
130 135 140

Asn Asp Arg Ser Tyr Leu Ser Ser Asp Asp Val Ala Val Glu Glu Ser
145 150 155 160

Arg Ser Glu Tyr Val Ala Leu Leu Thr Ala Pro Ser His Val Trp Pro
165 170 175

Pro Val Thr Ser Gly Val Ala Lys Ala Arg Phe Asn Leu Gln Arg Ser
180 185 190

Asn Leu Leu Phe Ser Ile Thr Tyr Lys Trp Ile Asp Arg Leu Ser Arg
195 200 205

Ile Arg Phe Ser Asp Leu Asp Gly Ser Val Leu Phe Glu His Pro Val
210 215 220

His Arg Met Gly Ser Pro Arg Asp Asp Thr Ile Cys Gly Ile Trp Arg
225 230 235 240

Ser Leu Asn Arg Ser Thr Leu Arg Leu Leu Arg Met Gly His Ile Leu
245 250 255

Val Ser Leu Val Thr Thr Thr Leu Ser Glu Pro Glu Ile Ser Gly Lys
260 265 270

Ile Val Lys His Lys Ala Leu Phe Ser Glu Ser Phe Ser Ala Leu Leu
275 280 285

Thr Pro Glu Asp Ser Asp Glu Thr Gly Gly Gly Gly Leu Ala Met Leu
290 295 300

Thr Leu Ser Asp Val Asp Asp Asn Leu His Phe Ile Leu Met Leu Arg
305 310 315 320




Gly 


Leu 


Ser 


Gly 


Glu 

325 


Glu 


Gly 


Asp 


Gin 


lie 

330 


Pro 


lie 


Leu Val 


Gin 

335 


lie 


20 


Ser 


His 


Gin 


Asn 

340 


His 


Val 


lie 


Arg 


Glu 

345 


Leu 


Tyr 


Ala 


Asn lie 
350 


Ser 


Ala 




Gin 


Glu 


Gin 

355 


Asp 


Phe 


Ala 


Glu 


Val 

360 


Leu 


Pro 


Asp 


Leu 


Ser Ser 
365 


Arg 


Glu 




Met 


Leu 

370 


Trp 


Leu 


Ala 


Gin 


Gly 

375 


Gin 


Leu 


Glu 


lie 


Ser 

380 


Val Gin 


Thr 


Glu 


25 


Gly 

385 


Arg 


Arg 


Pro 


Gin 


Ser 

390 


Met 


Ser 


Gly 


lie 


lie 

395 


Thr 


Val Arg 


Lys 


Ser 

400 




Cys 


Asp 


Thr 


. Leu 


Gin 

405 


Ser 


Val 


Leu 


Ser 


Gly 

410 


Gly 


Asp 


Ala' Leu 


Asn 

415 


Pro 


30 


Thr 


Lys 


Thr Gly 
420 


Ala 


Val 


Gly 


Ser 


Ala 

425 


Ser 


lie 


Thr 


Leu His 
430 


Glu 


Asn 




Gly 


Thr 


Leu 

435 


Glu 


Tyr 


Gin 


lie 


Gin 

440 


He 


Ala 


Gly 


Thr 


Met Ser 
445 


Thr 


Val 




Thr 


Ala 

450 


Val 


Thr 


Leu 


Glu 


Thr 

455 


Lys 


Pro 


Arg 


Arg 


Lys 

460 


Thr Lys 


Arg 


Asn 


35 


lie 

465 


Leu 


His 


Asp 


Met 


Ser 

470 


Lys 


Asp 


Tyr 


His 


Asd 

475 


Gly 


Arg Val 


Trp 


Gly 

480 




Tyr 


Trp 


lie 


Asp 


Ala 

485 


Asn 


Ala 


Arg 


Asp 


Leu 

490 


His 


Met 


Leu. Leu 


Gin 

495 


Ser 


40 


Glu 


Leu 


Phe 


Leu 

500 


Asn 


Val 


Ala 


Thr 


Lys 

505 


Asp 


Phe 


Gin 


Glu Gly 
510 


Glu 


Leu 




Arg 


Gly 


Gin 

515 


lie 


Thr 


Pro 


Leu 


Leu 

520 


Tyr 


Ser 


Gly 


Leu 


Trp Ala 
525 


Arg 


Tyr 




Glu 


Lys 

530 


Leu 


Pro 


Val 


Pro 


Leu 

535 


Ala 


Gly 


Gin 


Phe 


Val 

540 


Ser Pro 


Pro 


lie 


45 ' 


Arg 

545 


Thr 


Gly 


Ser 


Ala 


Gly 
• 550 


His 


Ala 


Trp 


Val 


Ser 

555 


Leu 


Asp Glu 


His 


Cys 

560 




His 


Leu 


His 


Tyr 


Gin 

565 


lie 


Val 


Val 


Thr 


Gly 

570 


Leu 


Gly 


Lys Ala 


Glu 

575 


Asp 


50 


Ala 


Ala 


Leu 


Asn 

580 


Ala 


His 


Leu 


His 


Gly 

585 


Phe 


Ala 


Glu 


Leu Gly 
590 


Glu 


Val 




Gly 


Glu 


Ser 

595 


Ser 


Pro 


Gly 


His 


Lys 

600 


Arg 


Leu 


Leu 


Lys 


Gly Phe 
605 


Tyr 


Gly 




Ser 


Glu 

610 


Ala 


Gin 


Gly 


Ser 


Val 

615 


Lys 


Asp 


Leu 


Asp 


Leu 

620 


Glu Leu 


Leu 


Gly 


55 


His 

625 


Leu 


Ser Arg 


Gly 


Thr 

630 


Ala 


Phe 


lie 


Gin 


Val 

635 


Ser 


Thr Lys 


Leu 


Asn 

640 




Pro 


Arg 


Gly Glu 


lie 

645 


Arg 


Gly 


Gin 


lie 


His 

650 


lie 


Pro 


Asn Ser 


Cys 

655 


Glu 


60 


Ser 


Gly 


Gly 


Val 

660 


Ser 


Leu 


Thr 


Pro 


Glu 

665 


Glu 


Pro 


Glu 


Tyr Glu 
670 


Tyr 


Glu 




lie 


Tyr 


Glu 

675 


Glu 


Gly 


Arg 


Gin 


Arg 

680 


Asp 


Pro 


Asp 


Asp 


Leu Arg 
685 


Lys 


Asp 




Pro 


Arg 

690 


Ala 


Cys 


Ser 


Phe 


Glu 

695 


Gly 


Gin 


Leu 


Arg 


Ala 

700 


His Gly 


Ser 


Arg 
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5 



10 



15 



20 



25 



Trp Ala 


Pro Asp Tyr Asp Arg 


Lys 


Cys 


Ser 


Val 


Cvs 


Ser 


Cys 


Gin 


Lys 


705 






710 








715 






720 


Arg Thr 


Val 


lie Cys 


Asp Pro 


He 


Val 


Cys 


Pro 


Pro 


Leu 


Asn 


Cys 


Ser 






725 








730 










735 




Gin Pro 


val 


His Leu 


Pro Asp 


Gin 


Cys 


Cys 


Pro 


Val 


Cys 


Glu 


Glu 


Lys 






740 






745 








750 




Lys Glu 


Met 


Arg Glu 


Val Lys 


Lys 


Pro 


Glu 


Arg 


Ala 


Arg Thr 


Ser 


Glu 




755 






760 










765 








Gly Cys 


Phe 


Phe Asp Gly Asp Arg 


Ser Trp 


Lys 


Ala 


Ala Gly 


Thr 


Arg 


770 






775 








780 








Trp His 


Pro 


Phe Val 


Pro Pro 


Phe Gly 


Leu 


He 


Lys 


Cys 


Ala 


He 


Cys 


785 






790 








795 






800 


Thr Cys 


Lys Gly Ser 


Thr Gly Glu 


Val 


His 


Cys 


Glu 


Lys 


Val 


Thr 


Cys 






805 








810 








815 


Pro Lys 


Leu 


Ser Cys 


Thr Asn 


Pro 


lie Arg 


Ala 


Asn 


Pro 


Ser 


Asp 


Cys 






820 






825 










830 


Cys Lys 


Gin Cys Pro 


Val Glu 


Glu Arg 


Ser 


Pro 


Met 


Glu 


Leu 


Ala 


Asp 




835 






840 










845 






Ser Met 


Gin 


Ser Asp 


Gly Ala Gly Ser 


Cys 


Arg 


Phe 


Gly Arg 


His 


Trp 


850 






855 










860 








Tyr Pro 


Asn 


His Glu Arg Trp 


His 


Pro 


Thr 


Val 


Pro 


Pro 


Phe 


Gly 


Glu 


865 






870 








875 








880 


Met Lys 


Cys 


Val Thr 


Cys Thr 


Cys 


Ala 


Glu Gly 


lie 


Thr 


Gin 


Cys 


Arg 






885 








890 










895 


Arg Gin 


Glu 


Cys Thr Gly Thr 


Thr 


Cys 


Gly Thr 


Gly 


Ser 


Lys 


Arg 


Asp 






900 






905 










910 


Arg Cys 


Cys 


Thr Lys 


Cys Lys 


Asp Ala Asn 


Gin 


Asp 


Glu Asp 


Glu 


Lys 




915 






920 










925 






Val Lys 


Ser Asp Glu 


Thr Arg 


Thr 


Pro 


Trp 


Ser 


Phe 










930 






935 








940 











30 
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